Russell Jurney created PIG-2803:
-----------------------------------

             Summary: Include Wonderdog (ElasticSearch Integration) in contrib/
                 Key: PIG-2803
                 URL: https://issues.apache.org/jira/browse/PIG-2803
             Project: Pig
          Issue Type: New Feature
          Components: tools
    Affects Versions: 0.10.0, 0.11, 0.10.1
         Environment: contrib/ github
            Reporter: Russell Jurney
            Assignee: Russell Jurney
            Priority: Critical
             Fix For: 0.10.1


I propose to add Wonderdog to Pig contrib/

Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig 
integration for ElasticSearch. This lets you index any Pig relation with a 
single UDF call, which is very powerful. Both writing searchable indexes and 
loading based on search queries is supported.

More information on Wonderdog is available at 
https://github.com/infochimps-labs/wonderdog and a great introduction to 
ElasticSearch is available at 
http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html

Wonderdog broke in Pig 0.10.0, and was patched to work here: 
https://github.com/infochimps-labs/wonderdog/pull/9 Even still, there is the 
issue of Pig creating schema files when storing and loading JSON that must be 
manually removed to make Wonderdog go.

Moving forward, I would like the Pig project to maintain Wonderdog in contrib/ 
and verify that it works with each version increment. Wonderdog is an 
incredibly useful library that is license compatible with Pig itself. Along 
with ElasticSearch, it adds the ability for any user to index his Pig relations 
and to load subsets of data by pushing search queries down to ElasticSearch.

I use Wonderdog in production and in my book, so I volunteer to do the 
maintenance on contrib/wonderdog.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to