[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-1106:
------------------------------------
    Issue Type: New Feature  (was: Wish)

> CLAVIN Integration
> ------------------
>
>                 Key: TIKA-1106
>                 URL: https://issues.apache.org/jira/browse/TIKA-1106
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.3
>         Environment: All
>            Reporter: Adam Estrada
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>              Labels: entity, geospatial, new-parser
>             Fix For: 1.8
>
>
> I've been evaluating CLAVIN as a way to extract location information from 
> unstructured text. It seems like meshing it with Tika in some way would make 
> a lot of sense. From the CLAVIN website:
> {quote}
> CLAVIN (*Cartographic Location And Vicinity INdexer*) is an open source 
> software package for document geotagging and geoparsing that employs 
> context-based geographic entity resolution. It combines a variety of open 
> source tools with natural language processing techniques to extract location 
> names from unstructured text documents and resolve them against gazetteer 
> records. Importantly, CLAVIN does not simply "look up" location names; 
> rather, it uses intelligent heuristics in an attempt to identify precisely 
> which "Springfield" (for example) was intended by the author, based on the 
> context of the document. CLAVIN also employs fuzzy search to handle 
> incorrectly-spelled location names, and it recognizes alternative names 
> (e.g., "Ivory Coast" and "Côte d'Ivoire") as referring to the same geographic 
> entity. By enriching text documents with structured geo data, CLAVIN enables 
> hierarchical geospatial search and advanced geospatial analytics on 
> unstructured data.
> {quote}
> There was only one other mention of "clavin" on the ASF JIRA site, so I 
> thought it was definitely worth posting here.
> https://github.com/Berico-Technologies/CLAVIN



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
