I've just created a new JIRA issue for the spatial functionality.
https://issues.apache.org/jira/browse/PIG-3344
This issue is only about the new datatype, which is the only thing that
needs to be changed internally in Pig in this phase. Pigeon already works
with the ESRI library, but it converts back and forth between the binary
representation and the Geometry class. Once the new datatype is added, we
can change Pigeon to work with this datatype too. We can still keep the
current conversion functionality, because it lets the system automatically
convert from the bytearray datatype, which gives us autodetection when a
column is not given a type in the schema.
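
To make the conversion concrete, here is a rough, self-contained sketch of
the idea in plain Java. It is not Pigeon's actual code: the class
GeometryAutodetectSketch, the Point stand-in, and the autodetect method are
names made up for illustration. It handles only 2D points and uses the
standard WKB and WKT encodings of a point instead of the ESRI classes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical sketch; none of these names come from Pig, Pigeon, or the ESRI library. */
final class GeometryAutodetectSketch {

    /** Stand-in for a real geometry class; it only models a 2D point. */
    static final class Point {
        final double x;
        final double y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    /** Encodes a point as little-endian WKB: byte-order flag, type code 1, x, y. */
    static byte[] toWkb(Point p) {
        ByteBuffer buf = ByteBuffer.allocate(21).order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) 1);   // 1 = little-endian byte order
        buf.putInt(1);       // WKB geometry type 1 = Point
        buf.putDouble(p.x);
        buf.putDouble(p.y);
        return buf.array();
    }

    /** Decodes a 21-byte WKB point in either byte order. */
    static Point fromWkb(byte[] wkb) {
        if (wkb.length != 21) {
            throw new IllegalArgumentException("expected a 21-byte WKB point");
        }
        ByteBuffer buf = ByteBuffer.wrap(wkb);
        buf.order(buf.get() == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        if (buf.getInt() != 1) {
            throw new IllegalArgumentException("this sketch only supports points");
        }
        double x = buf.getDouble();
        double y = buf.getDouble();
        return new Point(x, y);
    }

    /** Parses WKT of the form "POINT (x y)". */
    static Point fromWkt(String wkt) {
        String s = wkt.trim();
        int open = s.indexOf('(');
        int close = s.lastIndexOf(')');
        if (!s.toUpperCase(Locale.ROOT).startsWith("POINT") || open < 0 || close < open) {
            throw new IllegalArgumentException("not a WKT point: " + wkt);
        }
        String[] parts = s.substring(open + 1, close).trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected two coordinates: " + wkt);
        }
        return new Point(Double.parseDouble(parts[0]), Double.parseDouble(parts[1]));
    }

    /** Accepts a typed value, raw bytes (WKB or UTF-8 WKT), or a WKT string. */
    static Point autodetect(Object value) {
        if (value instanceof Point) {
            return (Point) value;          // already typed, no conversion needed
        }
        if (value instanceof byte[]) {
            byte[] bytes = (byte[]) value;
            // A WKB point starts with a byte-order flag of 0 or 1; WKT text never does.
            if (bytes.length == 21 && (bytes[0] == 0 || bytes[0] == 1)) {
                return fromWkb(bytes);
            }
            return fromWkt(new String(bytes, StandardCharsets.UTF_8));
        }
        if (value instanceof String) {
            return fromWkt((String) value);
        }
        throw new IllegalArgumentException("cannot interpret value as a geometry");
    }

    public static void main(String[] args) {
        Point fromBinary = autodetect(toWkb(new Point(1.5, -2.0)));
        Point fromText = autodetect("POINT (30 10)".getBytes(StandardCharsets.UTF_8));
        System.out.println(fromBinary.x + " " + fromBinary.y);
        System.out.println(fromText.x + " " + fromText.y);
    }
}
```

With a real Geometry datatype, only the first branch of autodetect would be
needed for typed columns. The other branches are the part worth keeping for
columns that arrive as untyped bytearrays.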

I don't know whether I should provide a patch for this issue myself or
whether someone else can work on it. I can of course do it, but I think it
will take me some time to finish as I'm not yet familiar with the internals
of Pig. Someone who is familiar with the parser would definitely do a
better job here. I could focus on Pigeon and add more spatial functions
there so that we have plenty of functions once the new datatype is added.
I'm open to both solutions; I'm just checking with you.

Thanks
Ahmed

Best regards,
Ahmed Eldawy


On Wed, May 29, 2013 at 12:17 PM, Russell Jurney
<russell.jur...@gmail.com> wrote:

> Awesome. This would be a great addition to Pig. Please create a JIRA.
>
> Russell Jurney http://datasyndrome.com
>
> On May 29, 2013, at 8:51 AM, Ahmed Eldawy <aseld...@gmail.com> wrote:
>
> > Hi all,
> >
> > Nick has pointed out to me an alternative GIS package that can replace
> > JTS. ESRI has recently released a GIS package
> > <https://github.com/Esri/geometry-api-java> under the Apache license. I
> > changed Pigeon to work with that new package. I think it could now be
> > easier to integrate this work with the main branch of Apache Pig. I will
> > go on with the current project and add more spatial functionality. We
> > can then add a new datatype to Apache Pig and link it to those functions.
> >
> > The ESRI package contains a class OGCGeometry
> > <http://esri.github.io/geometry-api-java/javadoc/com/esri/core/geometry/ogc/OGCGeometry.html>
> > which can be linked to a new datatype 'Geometry'. Do you think we can
> > rely on the new package and integrate the work with Apache Pig?
> >
> > On May 23, 2013 11:40 PM, "Ahmed Eldawy" <aseld...@gmail.com> wrote:
> >
> >> Hi all,
> >>  Thanks for your help. I've started the project with minimal
> >> functionality. It's currently hosted on GitHub and is licensed under
> >> the Apache license to make it easier to merge with Pig. Currently it
> >> has only a few functions. I implemented one function from each of
> >> several types of functions (e.g., aggregate and create). I'll keep
> >> adding functions, and any contributions to the project are welcome. To
> >> begin with, I need an Ant build file that runs the tests, compiles the
> >> code, and generates a jar file. I'm not familiar with Ant, so any help
> >> with this is appreciated.
> >> Here's the project home page
> >> https://github.com/aseldawy/pigeon
> >>
> >>
> >> If you have any comments or suggestion please contact me.
> >>
> >>
> >> Best regards,
> >> Ahmed Eldawy
> >>
> >>
> >> On Mon, May 6, 2013 at 3:09 PM, Jonathan Coveney <jcove...@gmail.com
> >wrote:
> >>
> >>> Nick: the only issue is that the way types are implemented in Pig
> >>> doesn't allow us to easily "plug in" types externally. Adding support
> >>> for that would be cool, but a fair bit of work.
> >>>
> >>>
> >>> 2013/5/6 Nick Dimiduk <ndimi...@gmail.com>
> >>>
> >>>> I'm not a lawyer, but I see no reason why this cannot be an external
> >>>> extension to Pig. It would behave the same way PostGIS is an external
> >>>> extension to Postgres. Any Apache issues would be toward general
> >>>> purpose enhancements, not specific to your project.
> >>>>
> >>>> Good on you!
> >>>> -n
> >>>>
> >>>> On Mon, May 6, 2013 at 10:12 AM, Ahmed Eldawy <aseld...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> I contacted the Solr developers to see how JTS can be included in an
> >>>>> Apache project. See
> >>>>> http://mail-archives.apache.org/mod_mbox/lucene-dev/201305.mbox/raw/%3C1367815102914-4060969.post%40n3.nabble.com%3E/
> >>>>> As far as I understand, they did not include it in the main Solr
> >>>>> project. Instead, they created a separate project (Spatial4j), which
> >>>>> is still licensed under the Apache license and refers to JTS. Users
> >>>>> have to download the JTS libraries separately to make it run. That's
> >>>>> pretty much the same plan that Jonathan mentioned. We will still have
> >>>>> the overhead of serializing/deserializing the shapes each time a
> >>>>> function is called. Also, we will have to use the ugly bytearray data
> >>>>> type for spatial data instead of creating a dedicated data type
> >>>>> (e.g., Geometry).
> >>>>> I think using Spatial4j instead of JTS will not be sufficient for our
> >>>>> case, as we need to provide access to all the spatial functions of
> >>>>> JTS, such as Union, Intersection, Difference, etc. This way we can
> >>>>> claim conformity with OGC standards, which brings visibility and
> >>>>> appreciation from the spatial community.
> >>>>> I also think this means I will not add any issues to JIRA, as it is
> >>>>> now a separate project. I'm planning to host it on GitHub and keep
> >>>>> all the issues there.
> >>>>> Let me know if you have any suggestions or comments.
> >>>>>
> >>>>> Thanks
> >>>>> Ahmed
> >>>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Ahmed Eldawy
> >>>>>
> >>>>>
> >>>>> On Mon, May 6, 2013 at 9:53 AM, Jonathan Coveney <jcove...@gmail.com
> >
> >>>>> wrote:
> >>>>>
> >>>>>> You can give them all the same label or tag and filter on that later
> >>>> on.
> >>>>>>
> >>>>>>
> >>>>>> 2013/5/6 Ahmed Eldawy <aseld...@gmail.com>
> >>>>>>
> >>>>>>> Thanks all for taking the time to respond. Daniel, I didn't know
> >>>>>>> that Solr uses JTS. This is a good finding, and we can definitely
> >>>>>>> ask them whether there is a workaround. Jonathan, I thought of the
> >>>>>>> same idea of serializing/deserializing a bytearray each time a UDF
> >>>>>>> is called. The deserialization part is good for letting Pig
> >>>>>>> autodetect spatial types if they are not set explicitly in the
> >>>>>>> schema. What is the best way to start this? I want to add an
> >>>>>>> initial set of JIRA issues and start working on them, but I also
> >>>>>>> need to keep the work grouped in some way, just for organization.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Ahmed
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Ahmed Eldawy
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, May 4, 2013 at 4:47 PM, Jonathan Coveney <
> >>> jcove...@gmail.com
> >>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I agree that this is cool, and if other projects are using JTS it
> >>>>>>>> is worth talking to them to see how. I also agree that licensing
> >>>>>>>> is very frustrating.
> >>>>>>>>
> >>>>>>>> In the short term, however, while it is annoying to have to manage
> >>>>>>>> the serialization and deserialization yourself, you can have the
> >>>>>>>> geometry type be passed around as a bytearray type. Your UDFs will
> >>>>>>>> have to know this and treat it accordingly, but if you did this
> >>>>>>>> then all of the tools could be in an external project on GitHub
> >>>>>>>> instead of a branch in Pig. Then, if we can get the licensing
> >>>>>>>> done, we could add the Geometry type to Pig. Adding types,
> >>>>>>>> honestly, is kind of tedious but not super difficult, so once the
> >>>>>>>> rest is done, that shouldn't be too hard.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2013/5/4 Russell Jurney <russell.jur...@gmail.com>
> >>>>>>>>
> >>>>>>>>> If a way could be found, this would be an awesome addition to
> >>>> Pig.
> >>>>>>>>>
> >>>>>>>>> Russell Jurney http://datasyndrome.com
> >>>>>>>>>
> >>>>>>>>> On May 3, 2013, at 4:09 PM, Daniel Dai <da...@hortonworks.com
> >>>>
> >>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> I am not sure how other Apache projects are dealing with it.
> >>>>>>>>>> It seems Solr also has some connector to JTS?
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Daniel
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Thu, May 2, 2013 at 11:59 AM, Ahmed Eldawy <
> >>>>> aseld...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks Alan for your interest. It's too bad that an open source
> >>>>>>>>>>> licensing issue is holding me back from doing some open source
> >>>>>>>>>>> work. I understand the issue and your workarounds make sense.
> >>>>>>>>>>> However, as I mentioned in the beginning, I don't want to have
> >>>>>>>>>>> my own branch of Pig because it makes my extension less
> >>>>>>>>>>> portable. I'll think of another way to do it. I'll ask Vivid
> >>>>>>>>>>> Solutions if they can dual-license their code, although I think
> >>>>>>>>>>> the answer will be no. I'll also think of a way to ship my
> >>>>>>>>>>> extension as a set of jar files without the need to change the
> >>>>>>>>>>> core of Pig. This way, it can be easily ported to newer
> >>>>>>>>>>> versions of Pig.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>> Ahmed
> >>>>>>>>>>>
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>> Ahmed Eldawy
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, May 2, 2013 at 12:33 PM, Alan Gates <
> >>>>>> ga...@hortonworks.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I know this is frustrating, but the different licenses do have
> >>>>>>>>>>>> different requirements that make it so that Apache can't ship
> >>>>>>>>>>>> GPL code. A legal explanation is at
> >>>>>>>>>>>> http://www.apache.org/licenses/GPL-compatibility.html
> >>>>>>>>>>>> For additional info on the LGPL-specific questions see
> >>>>>>>>>>>> http://www.apache.org/legal/3party.html
> >>>>>>>>>>>>
> >>>>>>>>>>>> As far as pulling it in via ivy, the issue isn't so much
> >>>> where
> >>>>>> the
> >>>>>>>> code
> >>>>>>>>>>>> lives as much as what code we are requiring to make Pig
> >>> work.
> >>>>> If
> >>>>>>>>>>> something
> >>>>>>>>>>>> that is [L]GPL is required for Pig it violates Apache
> >>> rules
> >>>> as
> >>>>>>>> outlined
> >>>>>>>>>>>> above.  It also would be a show stopper for a lot of
> >>>> companies
> >>>>>> that
> >>>>>>>>>>>> redistribute Pig and that are allergic to GPL software.
> >>>>>>>>>>>>
> >>>>>>>>>>>> So, as I said before, if you wanted to continue with that
> >>>>> library
> >>>>>>> and
> >>>>>>>>>>> they
> >>>>>>>>>>>> are not willing to relicense it then it would have to be
> >>>> bolted
> >>>>>> on
> >>>>>>>>> after
> >>>>>>>>>>>> Apache Pig is built.  Nothing stops you from doing this by
> >>>>>>>> downloading
> >>>>>>>>>>>> Apache Pig, adding this library and your code, and
> >>>>>> redistributing,
> >>>>>>>>> though
> >>>>>>>>>>>> it wouldn't then be open to all Pig users.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Alan.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On May 1, 2013, at 6:08 PM, Ahmed Eldawy wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for your response. I was never good at
> >>>> differentiating
> >>>>>> all
> >>>>>>>>> those
> >>>>>>>>>>>>> open source licenses. I mean what is the point making
> >>> open
> >>>>>> source
> >>>>>>>>>>>> licenses
> >>>>>>>>>>>>> if it blocks me from using a library in an open source
> >>>>> project.
> >>>>>>> Any
> >>>>>>>>>>> way,
> >>>>>>>>>>>>> I'm not going into debate here. Just one question, if we
> >>> use
> >>>>> JTS
> >>>>>>> as
> >>>>>>>> a
> >>>>>>>>>>>>> library (jar file) without adding the code in Pig, is it
> >>>>> still a
> >>>>>>>>>>>> violation?
> >>>>>>>>>>>>> We'll use ivy, for example, to download the jar file when
> >>>>>>> compiling.
> >>>>>>>>>>>>> On May 1, 2013 7:50 PM, "Alan Gates" <
> >>> ga...@hortonworks.com
> >>>>>
> >>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Passing on the technical details for a moment, I see a
> >>>>>> licensing
> >>>>>>>>>>> issue.
> >>>>>>>>>>>>>> JTS is licensed under LGPL.  Apache projects cannot
> >>> contain
> >>>>> or
> >>>>>>> ship
> >>>>>>>>>>>>>> [L]GPL.  Apache does not meet the requirements of GPL
> >>> and
> >>>>> thus
> >>>>>> we
> >>>>>>>>>>> cannot
> >>>>>>>>>>>>>> repackage their code. If you wanted to go forward using
> >>>> that
> >>>>>>> class
> >>>>>>>>>>> this
> >>>>>>>>>>>>>> would have to be packaged as an add on that was
> >>> downloaded
> >>>>>>>> separately
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>> not from Apache.  Another option is to work with the JTS
> >>>>>>> community
> >>>>>>>>> and
> >>>>>>>>>>>> see
> >>>>>>>>>>>>>> if they are willing to dual license their code under
> >>> BSD or
> >>>>>>> Apache
> >>>>>>>>>>>> license
> >>>>>>>>>>>>>> so that Pig could include it.  If neither of those are
> >>> an
> >>>>>> option
> >>>>>>>> you
> >>>>>>>>>>>> would
> >>>>>>>>>>>>>> need to come up with a new class to contain your spatial
> >>>>> data.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Alan.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On May 1, 2013, at 5:40 PM, Ahmed Eldawy wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>> First, sorry for the long email. I wanted to put all my
> >>>>>> thoughts
> >>>>>>>>> here
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> get your feedback.
> >>>>>>>>>>>>>>> I'm proposing a major addition to Pig that will greatly
> >>>>>> increase
> >>>>>>>> its
> >>>>>>>>>>>>>>> functionality and user base. It is simply to add
> >>> spatial
> >>>>>> support
> >>>>>>>> to
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> language and the framework. I've already started
> >>> working
> >>>> on
> >>>>>> that
> >>>>>>>> but
> >>>>>>>>>>> I
> >>>>>>>>>>>>>>> don't want it to be just another branch. I want it,
> >>>>>> eventually,
> >>>>>>> to
> >>>>>>>>> be
> >>>>>>>>>>>>>>> merged with the trunk of Apache Pig. So, I'm sending this email
> >>>>>>>>>>>>>>> mainly to reach out to the main contributors of Pig to see the
> >>>>>>>>>>>>>>> feasibility of this.
> >>>>>>>>>>>>>>> This addition is a part of a big project we have been
> >>>>> working
> >>>>>> on
> >>>>>>>> in
> >>>>>>>>>>>>>>> University of Minnesota; the project is called Spatial
> >>>>> Hadoop.
> >>>>>>>>>>>>>>> http://spatialhadoop.cs.umn.edu. It's about building a
> >>>>>>> MapReduce
> >>>>>>>>>>>>>> framework
> >>>>>>>>>>>>>>> (Hadoop) that is capable of maintaining and analyzing
> >>>>> spatial
> >>>>>>> data
> >>>>>>>>>>>>>>> efficiently. I'm the main guy behind that project and
> >>>> since
> >>>>> we
> >>>>>>>>>>> released
> >>>>>>>>>>>>>> its
> >>>>>>>>>>>>>>> first version, we received very encouraging responses
> >>> from
> >>>>>>>> different
> >>>>>>>>>>>>>> groups
> >>>>>>>>>>>>>>> in the research and industrial community. I'm sure the
> >>>>>> addition
> >>>>>>> we
> >>>>>>>>>>> want
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> make to Pig Latin will be widely accepted by the
> >>> people in
> >>>>> the
> >>>>>>>>>>> spatial
> >>>>>>>>>>>>>>> community.
> >>>>>>>>>>>>>>> I'm proposing a plan here while we're still in the
> >>> early
> >>>>>> phases
> >>>>>>> of
> >>>>>>>>>>> this
> >>>>>>>>>>>>>>> task to be able to discuss it with the main
> >>> contributors
> >>>> and
> >>>>>> see
> >>>>>>>> its
> >>>>>>>>>>>>>>> feasibility. First of all, I think that we need to
> >>> change
> >>>>> the
> >>>>>>> core
> >>>>>>>>> of
> >>>>>>>>>>>> Pig
> >>>>>>>>>>>>>>> to be able to support spatial data. Providing a set of
> >>>> UDFs
> >>>>>> only
> >>>>>>>> is
> >>>>>>>>>>> not
> >>>>>>>>>>>>>>> enough. The main reason is that Pig Latin does not
> >>>> provide a
> >>>>>> way
> >>>>>>>> to
> >>>>>>>>>>>>>> create
> >>>>>>>>>>>>>>> a new data type which is needed for spatial data. Once
> >>> we
> >>>>> have
> >>>>>>> the
> >>>>>>>>>>>>>> spatial
> >>>>>>>>>>>>>>> data types we need, the functionality can be expanded
> >>>> using
> >>>>>> more
> >>>>>>>>>>> UDFs.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Here's the plan as I see it.
> >>>>>>>>>>>>>>> 1- Introduce a new primitive data type Geometry which
> >>>>>> represents
> >>>>>>>> all
> >>>>>>>>>>>>>>> spatial data types. In the underlying system, this will
> >>>> map
> >>>>> to
> >>>>>>>>>>>>>>> com.vividsolutions.jts.geom.Geometry. This is a class
> >>> from
> >>>>>> Java
> >>>>>>>>>>>> Topology
> >>>>>>>>>>>>>>> Suite (JTS) [
> >>>> http://www.vividsolutions.com/jts/JTSHome.htm
> >>>>> ],
> >>>>>> a
> >>>>>>>>>>> stable
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> efficient open source Java library for spatial data
> >>> types
> >>>>> and
> >>>>>>>>>>>> algorithms.
> >>>>>>>>>>>>>>> It is very popular in the spatial community, and a C++ port of
> >>>>>>>>>>>>>>> it is used in PostGIS [http://postgis.net/] (a spatial library
> >>>>>>>>>>>>>>> for Postgres). JTS also conforms to the standards of the Open
> >>>>>>>>>>>>>>> Geospatial Consortium (OGC) [http://www.opengeospatial.org/],
> >>>>>>>>>>>>>>> which publishes open standards for spatial data types. The
> >>>>>>>>>>>>>>> Geometry data type is read from and written to text files
> >>>>>>>>>>>>>>> using the Well Known Text (WKT) format. There is also a way to
> >>>>>>>>>>>>>>> convert it to/from binary so that it can work with binary
> >>>>>>>>>>>>>>> files and streams.
> >>>>>>>>>>>>>>> 2- Add functions that manipulate spatial data types.
> >>> These
> >>>>>> will
> >>>>>>> be
> >>>>>>>>>>>> added
> >>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>> UDFs and we will not need to mess with the internals of
> >>>> Pig.
> >>>>>>> Most
> >>>>>>>>>>>>>> probably,
> >>>>>>>>>>>>>>> there will be one new class for each operation (e.g.,
> >>>> union
> >>>>> or
> >>>>>>>>>>>>>>> intersection). I think it will be good to put these new
> >>>>>>> operations
> >>>>>>>>>>>> inside
> >>>>>>>>>>>>>>> the core of Pig so that users can use it without
> >>> having to
> >>>>>> write
> >>>>>>>> the
> >>>>>>>>>>>>>> fully
> >>>>>>>>>>>>>>> qualified class name. Also, since there is no way to
> >>>>>> implicitly
> >>>>>>>> cast
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> spatial data type to a non-spatial data types, there
> >>> will
> >>>>> not
> >>>>>> be
> >>>>>>>> any
> >>>>>>>>>>>>>>> conflicts in existing operations or new operations. All
> >>>> new
> >>>>>>>>>>> operations,
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> only the new operations, will be working on spatial
> >>> data
> >>>>>> types.
> >>>>>>>> Here
> >>>>>>>>>>> is
> >>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>> initial list of operations that can be added. All those
> >>>>>>> operations
> >>>>>>>>>>> are
> >>>>>>>>>>>>>>> already implemented in JTS and the UDFs added to Pig
> >>> will
> >>>> be
> >>>>>>> just
> >>>>>>>>>>>>>> wrappers
> >>>>>>>>>>>>>>> around them.
> >>>>>>>>>>>>>>> **Predicates (used for spatial filtering)
> >>>>>>>>>>>>>>> Equals
> >>>>>>>>>>>>>>> Disjoint
> >>>>>>>>>>>>>>> Intersects
> >>>>>>>>>>>>>>> Touches
> >>>>>>>>>>>>>>> Crosses
> >>>>>>>>>>>>>>> Within
> >>>>>>>>>>>>>>> Contains
> >>>>>>>>>>>>>>> Overlaps
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> **Operations
> >>>>>>>>>>>>>>> Envelope
> >>>>>>>>>>>>>>> Area
> >>>>>>>>>>>>>>> Length
> >>>>>>>>>>>>>>> Buffer
> >>>>>>>>>>>>>>> ConvexHull
> >>>>>>>>>>>>>>> Intersection
> >>>>>>>>>>>>>>> Union
> >>>>>>>>>>>>>>> Difference
> >>>>>>>>>>>>>>> SymDifference
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> **Aggregate functions
> >>>>>>>>>>>>>>> Accum
> >>>>>>>>>>>>>>> ConvexHull
> >>>>>>>>>>>>>>> Union
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 3- The third step is to implement spatial indexes
> >>> (e.g.,
> >>>>> Grid
> >>>>>> or
> >>>>>>>>>>>>>> R-tree). A
> >>>>>>>>>>>>>>> Pig loader and Pig output classes will be created for
> >>>> those
> >>>>>>>> indexes.
> >>>>>>>>>>>> Note
> >>>>>>>>>>>>>>> that currently we have SpatialOutputFormat and
> >>>>>>> SpatialInputFormat
> >>>>>>>>> for
> >>>>>>>>>>>>>> those
> >>>>>>>>>>>>>>> indexes inside the Spatial Hadoop project, but we need
> >>> to
> >>>>>> tweak
> >>>>>>>> them
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>> work with Pig.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 4- (Advanced) Implement more sophisticated algorithms
> >>> for
> >>>>>>> spatial
> >>>>>>>>>>>>>>> operations that utilize the indexes. For example, we
> >>> can
> >>>>> have
> >>>>>> a
> >>>>>>>>>>>> specific
> >>>>>>>>>>>>>>> algorithm for spatial range query or spatial join.
> >>> Again,
> >>>> we
> >>>>>>>> already
> >>>>>>>>>>>> have
> >>>>>>>>>>>>>>> algorithms built for different operations implemented
> >>> in
> >>>>>> Spatial
> >>>>>>>>>>> Hadoop
> >>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>> MapReduce programs, but they will need to be modified
> >>> to
> >>>>> work
> >>>>>> in
> >>>>>>>> Pig
> >>>>>>>>>>>>>>> environment and get to work with other operations.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This is my whole plan for the spatial extension to Pig.
> >>>> I've
> >>>>>>>> already
> >>>>>>>>>>>>>>> started with the first step but as I mentioned
> >>> earlier, I
> >>>>>> don't
> >>>>>>>> want
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>> the work for our project and then the work gets
> >>>> forgotten. I
> >>>>>>> want
> >>>>>>>> to
> >>>>>>>>>>>>>>> contribute to Pig and do my research at the same time.
> >>> If
> >>>>> you
> >>>>>>>> think
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> plan is plausible, I'll open JIRA issues for the above
> >>>> tasks
> >>>>>> and
> >>>>>>>>>>> start
> >>>>>>>>>>>>>>> shipping patches to do the stuff. I'll conform with the
> >>>>>>> standards
> >>>>>>>> of
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>> project such as adding tests and well commenting the
> >>> code.
> >>>>>>>>>>>>>>> Sorry for the long email and hope to hear back from
> >>> you.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>> Ahmed Eldawy
> >>
> >>
>
