JIRAs do best when one person drives them to completion.

On Mon, Jun 3, 2013 at 10:26 AM, Ahmed Eldawy <aseld...@gmail.com> wrote:

> I've just created a new JIRA issue for the spatial functionality.
> https://issues.apache.org/jira/browse/PIG-3344
> This issue is all about the new datatype which is the only thing that needs
> to be changed internally in Pig in this phase. Pigeon is already working
> with the ESRI library but it converts between binary representation and
> Geometry class back and forth. Once the new datatype is added, we can
> change Pigeon to work with this datatype too. We can still keep the current
> conversion functionality, since it lets the system convert automatically
> from the bytearray datatype, which gives us autodetection when a column is
> not given a type in the schema.
>
> I don't know whether I should provide a patch for this issue myself or
> whether someone else can work on it. I can of course do it, but I think it will
> take me some time to finish as I'm not yet familiar with the internals of
> Pig. Someone who is familiar with the parser would definitely do a better
> job here. I can focus on Pigeon and add more spatial functions there so
> that we can have plenty of functions once the new datatype is added. I'm
> open to both solutions but I'm just checking with you.
>
> Thanks
> Ahmed
>
> Best regards,
> Ahmed Eldawy
>
>
> On Wed, May 29, 2013 at 12:17 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>
> > Awesome. This would be a great addition to Pig. Please create a JIRA.
> >
> > Russell Jurney http://datasyndrome.com
> >
> > On May 29, 2013, at 8:51 AM, Ahmed Eldawy <aseld...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Nick has pointed out to me an alternative GIS package that can replace JTS.
> > > ESRI has recently released a GIS package
> > > <https://github.com/Esri/geometry-api-java> under the Apache
> > > license. I changed Pigeon to work with that new package. I
> > > think it could be easier now to integrate this work with the main branch of
> > > Apache Pig. I will go on with the current project and add more spatial
> > > functionality. We can then add a new datatype to Apache Pig and link it to
> > > those functions.
> > >
> > > The ESRI package contains a class OGCGeometry
> > > <http://esri.github.io/geometry-api-java/javadoc/com/esri/core/geometry/ogc/OGCGeometry.html>
> > > which can be linked to a new datatype 'Geometry'. Do you think we can rely on the
> > > new package and integrate the work with Apache Pig?
> > >
> > > On May 23, 2013 11:40 PM, "Ahmed Eldawy" <aseld...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>  Thanks for your help. I've started the project with minimal
> > >> functionality as a start. It's currently hosted on GitHub. It is licensed
> > >> under the Apache License to make it easier to merge with Pig.
> > >> Currently it has only a very few functions. I implemented one function from
> > >> each of several categories (e.g., aggregate and create). I'll keep adding
> > >> functions, and any contributions to the project are welcome. To begin with,
> > >> I need an Ant build file that runs the tests, compiles, and generates a jar
> > >> file. I'm not familiar with Ant, so any help with this is welcome.
> > >> Here's the project home page:
> > >> https://github.com/aseldawy/pigeon
> > >>
> > >>
> > >> If you have any comments or suggestions, please contact me.
> > >>
> > >>
> > >> Best regards,
> > >> Ahmed Eldawy
> > >>
> > >>
> > >> On Mon, May 6, 2013 at 3:09 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
> > >>
> > >>> Nick: the only issue is that the way types are implemented in Pig doesn't
> > >>> allow us to easily "plug in" types externally. Adding support for that
> > >>> would be cool, but a fair bit of work.
> > >>>
> > >>>
> > >>> 2013/5/6 Nick Dimiduk <ndimi...@gmail.com>
> > >>>
> > >>>> I'm not a lawyer, but I see no reason why this cannot be an external
> > >>>> extension to Pig. It would behave the same way PostGIS is an external
> > >>>> extension to Postgres. Any Apache issues would be toward general
> > >>>> purpose enhancements, not specific to your project.
> > >>>>
> > >>>> Good on you!
> > >>>> -n
> > >>>>
> > >>>> On Mon, May 6, 2013 at 10:12 AM, Ahmed Eldawy <aseld...@gmail.com> wrote:
> > >>>>
> > >>>>> I contacted the Solr developers to see how JTS can be included in an
> > >>>>> Apache project. See
> > >>>>> http://mail-archives.apache.org/mod_mbox/lucene-dev/201305.mbox/raw/%3C1367815102914-4060969.post%40n3.nabble.com%3E/
> > >>>>> As far as I understand, they did not include it in the main Solr project;
> > >>>>> rather, they created a separate project (Spatial4j) which is still
> > >>>>> licensed under the Apache license and refers to JTS. Users will have to
> > >>>>> download the JTS libraries separately to make it run. That's pretty much the
> > >>>>> same plan that Jonathan mentioned. We will still have the overhead of
> > >>>>> serializing/deserializing the shapes each time a function is called. Also,
> > >>>>> we will have to use the ugly bytearray data type for spatial data instead
> > >>>>> of creating its own data type (e.g., Geometry).
> > >>>>> I think using Spatial4j instead of JTS will not be sufficient for our case,
> > >>>>> as we need to provide access to all spatial functions of JTS, such as
> > >>>>> Union, Intersection, Difference, etc. This way we can claim conformity
> > >>>>> with OGC standards, which gives visibility and appreciation in the spatial
> > >>>>> community.
> > >>>>> I think this also means I will not add any issues to JIRA, as it is now
> > >>>>> a separate project. I'm planning to host it on GitHub and have all the
> > >>>>> issues there.
> > >>>>> Let me know if you have any suggestions or comments.
> > >>>>>
> > >>>>> Thanks
> > >>>>> Ahmed
> > >>>>>
> > >>>>>
> > >>>>> Best regards,
> > >>>>> Ahmed Eldawy
> > >>>>>
> > >>>>>
> > >>>>> On Mon, May 6, 2013 at 9:53 AM, Jonathan Coveney <jcove...@gmail.com> wrote:
> > >>>>>
> > >>>>>> You can give them all the same label or tag and filter on that later on.
> > >>>>>>
> > >>>>>>
> > >>>>>> 2013/5/6 Ahmed Eldawy <aseld...@gmail.com>
> > >>>>>>
> > >>>>>>> Thanks all for taking the time to respond. Daniel, I didn't know that Solr
> > >>>>>>> uses JTS. This is a good finding, and we can definitely ask them to see if
> > >>>>>>> there is a workaround we can use. Jonathan, I thought of the same idea of
> > >>>>>>> serializing/deserializing a bytearray each time a UDF is called. The
> > >>>>>>> deserialization part is good for letting Pig auto-detect spatial types if
> > >>>>>>> not set explicitly in the schema. What is the best way to start this? I
> > >>>>>>> want to add an initial set of JIRA issues and start working on them, but I
> > >>>>>>> also need to keep the work grouped in some sense just for organization.
> > >>>>>>>
> > >>>>>>> Thanks
> > >>>>>>> Ahmed
> > >>>>>>>
> > >>>>>>> Best regards,
> > >>>>>>> Ahmed Eldawy
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Sat, May 4, 2013 at 4:47 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> I agree that this is cool, and if other projects are using JTS it is worth
> > >>>>>>>> talking to them to see how. I also agree that licensing is very frustrating.
> > >>>>>>>>
> > >>>>>>>> In the short term, however, while it is annoying to have to manage the
> > >>>>>>>> serialization and deserialization yourself, you can have the geometry type
> > >>>>>>>> be passed around as a bytearray type. Your UDFs will have to know this and
> > >>>>>>>> treat it accordingly, but if you did this then all of the tools could be in
> > >>>>>>>> an external project on GitHub instead of a branch in Pig. Then, if we can
> > >>>>>>>> get the licensing done, we could add the Geometry type to Pig. Adding
> > >>>>>>>> types, honestly, is kind of tedious but not super difficult, so once the
> > >>>>>>>> rest is done, that shouldn't be too difficult.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2013/5/4 Russell Jurney <russell.jur...@gmail.com>
> > >>>>>>>>
> > >>>>>>>>> If a way could be found, this would be an awesome addition to Pig.
> > >>>>>>>>>
> > >>>>>>>>> Russell Jurney http://datasyndrome.com
> > >>>>>>>>>
> > >>>>>>>>> On May 3, 2013, at 4:09 PM, Daniel Dai <da...@hortonworks.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> I am not sure how other Apache projects are dealing with it. It seems Solr
> > >>>>>>>>>> also has some connector to JTS?
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Daniel
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Thu, May 2, 2013 at 11:59 AM, Ahmed Eldawy <aseld...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Thanks Alan for your interest. It's too bad that an open source licensing
> > >>>>>>>>>>> issue is holding me back from doing some open source work. I understand the
> > >>>>>>>>>>> issue and your workarounds make sense. However, as I mentioned in the
> > >>>>>>>>>>> beginning, I don't want to have my own branch of Pig because it makes my
> > >>>>>>>>>>> extension less portable. I'll think of another way to do it. I'll ask Vivid
> > >>>>>>>>>>> Solutions if they can dual-license their code, although I think the answer
> > >>>>>>>>>>> will be no. I'll also think of a way to ship my extension as a set of jar
> > >>>>>>>>>>> files without the need to change the core of Pig. This way, it can be
> > >>>>>>>>>>> easily ported to newer versions of Pig.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks
> > >>>>>>>>>>> Ahmed
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best regards,
> > >>>>>>>>>>> Ahmed Eldawy
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Thu, May 2, 2013 at 12:33 PM, Alan Gates <ga...@hortonworks.com> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> I know this is frustrating, but the different licenses do have different
> > >>>>>>>>>>>> requirements that make it so that Apache can't ship GPL code. A legal
> > >>>>>>>>>>>> explanation is at http://www.apache.org/licenses/GPL-compatibility.html
> > >>>>>>>>>>>> For additional info on the LGPL specific questions see
> > >>>>>>>>>>>> http://www.apache.org/legal/3party.html
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> As far as pulling it in via ivy, the issue isn't so much where the code
> > >>>>>>>>>>>> lives as much as what code we are requiring to make Pig work. If something
> > >>>>>>>>>>>> that is [L]GPL is required for Pig it violates Apache rules as outlined
> > >>>>>>>>>>>> above. It also would be a show stopper for a lot of companies that
> > >>>>>>>>>>>> redistribute Pig and that are allergic to GPL software.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> So, as I said before, if you wanted to continue with that library and they
> > >>>>>>>>>>>> are not willing to relicense it then it would have to be bolted on after
> > >>>>>>>>>>>> Apache Pig is built. Nothing stops you from doing this by downloading
> > >>>>>>>>>>>> Apache Pig, adding this library and your code, and redistributing, though
> > >>>>>>>>>>>> it wouldn't then be open to all Pig users.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Alan.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On May 1, 2013, at 6:08 PM, Ahmed Eldawy wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for your response. I was never good at differentiating all those
> > >>>>>>>>>>>>> open source licenses. I mean, what is the point of making open source
> > >>>>>>>>>>>>> licenses if they block me from using a library in an open source project?
> > >>>>>>>>>>>>> Anyway, I'm not going into a debate here. Just one question: if we use JTS
> > >>>>>>>>>>>>> as a library (jar file) without adding the code to Pig, is it still a
> > >>>>>>>>>>>>> violation? We'll use ivy, for example, to download the jar file when
> > >>>>>>>>>>>>> compiling.
> > >>>>>>>>>>>>> On May 1, 2013 7:50 PM, "Alan Gates" <ga...@hortonworks.com> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Passing on the technical details for a moment, I see a licensing issue.
> > >>>>>>>>>>>>>> JTS is licensed under LGPL. Apache projects cannot contain or ship
> > >>>>>>>>>>>>>> [L]GPL. Apache does not meet the requirements of GPL and thus we cannot
> > >>>>>>>>>>>>>> repackage their code. If you wanted to go forward using that class this
> > >>>>>>>>>>>>>> would have to be packaged as an add-on that was downloaded separately and
> > >>>>>>>>>>>>>> not from Apache. Another option is to work with the JTS community and see
> > >>>>>>>>>>>>>> if they are willing to dual license their code under BSD or Apache license
> > >>>>>>>>>>>>>> so that Pig could include it. If neither of those is an option you would
> > >>>>>>>>>>>>>> need to come up with a new class to contain your spatial data.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Alan.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On May 1, 2013, at 5:40 PM, Ahmed Eldawy wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>>> First, sorry for the long email. I wanted to put all my thoughts here and
> > >>>>>>>>>>>>>>> get your feedback.
> > >>>>>>>>>>>>>>> I'm proposing a major addition to Pig that will greatly increase its
> > >>>>>>>>>>>>>>> functionality and user base. It is simply to add spatial support to the
> > >>>>>>>>>>>>>>> language and the framework. I've already started working on that but I
> > >>>>>>>>>>>>>>> don't want it to be just another branch. I want it, eventually, to be
> > >>>>>>>>>>>>>>> merged with the trunk of Apache Pig. So, I'm sending this email mainly to
> > >>>>>>>>>>>>>>> reach out to the main contributors of Pig to see the feasibility of this.
> > >>>>>>>>>>>>>>> This addition is a part of a big project we have been working on at the
> > >>>>>>>>>>>>>>> University of Minnesota; the project is called Spatial Hadoop
> > >>>>>>>>>>>>>>> (http://spatialhadoop.cs.umn.edu). It's about building a MapReduce
> > >>>>>>>>>>>>>>> framework (Hadoop) that is capable of maintaining and analyzing spatial
> > >>>>>>>>>>>>>>> data efficiently. I'm the main guy behind that project and since we
> > >>>>>>>>>>>>>>> released its first version, we have received very encouraging responses
> > >>>>>>>>>>>>>>> from different groups in the research and industrial community. I'm sure
> > >>>>>>>>>>>>>>> the addition we want to make to Pig Latin will be widely accepted by the
> > >>>>>>>>>>>>>>> people in the spatial community.
> > >>>>>>>>>>>>>>> I'm proposing a plan here while we're still in the early phases of this
> > >>>>>>>>>>>>>>> task to be able to discuss it with the main contributors and see its
> > >>>>>>>>>>>>>>> feasibility. First of all, I think that we need to change the core of Pig
> > >>>>>>>>>>>>>>> to be able to support spatial data. Providing a set of UDFs only is not
> > >>>>>>>>>>>>>>> enough. The main reason is that Pig Latin does not provide a way to create
> > >>>>>>>>>>>>>>> a new data type, which is needed for spatial data. Once we have the
> > >>>>>>>>>>>>>>> spatial data types we need, the functionality can be expanded using more
> > >>>>>>>>>>>>>>> UDFs.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Here's the plan as I see it.
> > >>>>>>>>>>>>>>> 1- Introduce a new primitive data type Geometry which represents all
> > >>>>>>>>>>>>>>> spatial data types. In the underlying system, this will map to
> > >>>>>>>>>>>>>>> com.vividsolutions.jts.geom.Geometry. This is a class from Java Topology
> > >>>>>>>>>>>>>>> Suite (JTS) [http://www.vividsolutions.com/jts/JTSHome.htm], a stable and
> > >>>>>>>>>>>>>>> efficient open source Java library for spatial data types and algorithms.
> > >>>>>>>>>>>>>>> It is very popular in the spatial community and a C++ port of it is used
> > >>>>>>>>>>>>>>> in PostGIS [http://postgis.net/] (a spatial library for Postgres). JTS
> > >>>>>>>>>>>>>>> also conforms to the Open Geospatial Consortium (OGC)
> > >>>>>>>>>>>>>>> [http://www.opengeospatial.org/] standards, which are open standards for
> > >>>>>>>>>>>>>>> spatial data types. The Geometry data type is read from and written to
> > >>>>>>>>>>>>>>> text files using the Well Known Text (WKT) format. There is also a way to
> > >>>>>>>>>>>>>>> convert it to/from binary so that it can work with binary files and
> > >>>>>>>>>>>>>>> streams.
> > >>>>>>>>>>>>>>> 2- Add functions that manipulate spatial data types. These will be added
> > >>>>>>>>>>>>>>> as UDFs and we will not need to mess with the internals of Pig. Most
> > >>>>>>>>>>>>>>> probably, there will be one new class for each operation (e.g., union or
> > >>>>>>>>>>>>>>> intersection). I think it will be good to put these new operations inside
> > >>>>>>>>>>>>>>> the core of Pig so that users can use them without having to write the
> > >>>>>>>>>>>>>>> fully qualified class name. Also, since there is no way to implicitly cast
> > >>>>>>>>>>>>>>> a spatial data type to a non-spatial data type, there will not be any
> > >>>>>>>>>>>>>>> conflicts in existing operations or new operations. All new operations,
> > >>>>>>>>>>>>>>> and only the new operations, will be working on spatial data types. Here
> > >>>>>>>>>>>>>>> is an initial list of operations that can be added. All those operations
> > >>>>>>>>>>>>>>> are already implemented in JTS and the UDFs added to Pig will be just
> > >>>>>>>>>>>>>>> wrappers around them.
> > >>>>>>>>>>>>>>> **Predicates (used for spatial filtering)
> > >>>>>>>>>>>>>>> Equals
> > >>>>>>>>>>>>>>> Disjoint
> > >>>>>>>>>>>>>>> Intersects
> > >>>>>>>>>>>>>>> Touches
> > >>>>>>>>>>>>>>> Crosses
> > >>>>>>>>>>>>>>> Within
> > >>>>>>>>>>>>>>> Contains
> > >>>>>>>>>>>>>>> Overlaps
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> **Operations
> > >>>>>>>>>>>>>>> Envelope
> > >>>>>>>>>>>>>>> Area
> > >>>>>>>>>>>>>>> Length
> > >>>>>>>>>>>>>>> Buffer
> > >>>>>>>>>>>>>>> ConvexHull
> > >>>>>>>>>>>>>>> Intersection
> > >>>>>>>>>>>>>>> Union
> > >>>>>>>>>>>>>>> Difference
> > >>>>>>>>>>>>>>> SymDifference
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> **Aggregate functions
> > >>>>>>>>>>>>>>> Accum
> > >>>>>>>>>>>>>>> ConvexHull
> > >>>>>>>>>>>>>>> Union
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> 3- The third step is to implement spatial indexes (e.g., Grid or R-tree).
> > >>>>>>>>>>>>>>> A Pig loader and Pig output classes will be created for those indexes.
> > >>>>>>>>>>>>>>> Note that currently we have SpatialOutputFormat and SpatialInputFormat for
> > >>>>>>>>>>>>>>> those indexes inside the Spatial Hadoop project, but we need to tweak them
> > >>>>>>>>>>>>>>> to work with Pig.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> 4- (Advanced) Implement more sophisticated algorithms for spatial
> > >>>>>>>>>>>>>>> operations that utilize the indexes. For example, we can have a specific
> > >>>>>>>>>>>>>>> algorithm for spatial range query or spatial join. Again, we already have
> > >>>>>>>>>>>>>>> algorithms built for different operations implemented in Spatial Hadoop as
> > >>>>>>>>>>>>>>> MapReduce programs, but they will need to be modified to work in the Pig
> > >>>>>>>>>>>>>>> environment and to work together with other operations.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> This is my whole plan for the spatial extension to Pig. I've already
> > >>>>>>>>>>>>>>> started with the first step but, as I mentioned earlier, I don't want to
> > >>>>>>>>>>>>>>> do the work for our project and then have the work forgotten. I want to
> > >>>>>>>>>>>>>>> contribute to Pig and do my research at the same time. If you think the
> > >>>>>>>>>>>>>>> plan is plausible, I'll open JIRA issues for the above tasks and start
> > >>>>>>>>>>>>>>> shipping patches to do the stuff. I'll conform to the standards of the
> > >>>>>>>>>>>>>>> project, such as adding tests and commenting the code well.
> > >>>>>>>>>>>>>>> Sorry for the long email and hope to hear back from you.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>>>>> Ahmed Eldawy
> > >>
> > >>
> >
>



-- 
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
