[GitHub] drill issue #258: DRILL-4091: Support for additional gis operations in gis c...

2017-10-27 Thread k255
Github user k255 commented on the issue:

https://github.com/apache/drill/pull/258
  
It's good that there's now a committer who is aware of the GIS context! The 
list of functions added in this PR is as follows:
ST_Buffer, ST_Contains, ST_Crosses, ST_Difference, ST_Disjoint, 
ST_Distance, ST_Envelope, ST_Equals, ST_Intersects, ST_Overlaps, ST_Relate, 
ST_Touches, ST_Transform, ST_Union, ST_UnionAggregate, ST_X, ST_Y, ST_XMin, 
ST_XMax, ST_YMin, ST_YMax
 
Regarding the documentation, I'd rather not duplicate it, because I 
followed what is available in PostGIS (which uses the GEOS lib, much as 
drill-gis uses the relevant java libs, esri and proj4j), and these functions 
are defined in the Open Geospatial Consortium (OGC) specs. Of course, here we 
have just a subset of what PostGIS is capable of, but I think it's a valuable 
subset.
For example, the docs for the ST_X function are at 
http://www.postgis.net/docs/ST_X.html
For example usage, please refer to the examples contained in the readme at:
    https://github.com/k255/drill-gis

I'll also eventually need to think about a blog post/presentation on this 
extension, though most probably not in the coming days but later on.




---


[GitHub] drill issue #258: DRILL-4091: Support for additional gis operations in gis c...

2017-10-27 Thread k255
Github user k255 commented on the issue:

https://github.com/apache/drill/pull/258
  
@amansinha100 better late than never! The PR is updated now.
@cgivre offered to help review this.
@joeauty in further development we can probably consider adding GeoJSON 
support. I'm happy that you like it!


---


[GitHub] drill pull request: DRILL-4091: Support for additional gis operati...

2016-02-11 Thread k255
Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/258#issuecomment-182857803
  
The aggregate version of st_union allows merging geometries using 'group by'.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4303: ESRI Shapefile (shp) format plugin

2016-01-22 Thread k255
GitHub user k255 opened a pull request:

https://github.com/apache/drill/pull/335

DRILL-4303: ESRI Shapefile (shp) format plugin

Shp format plugin. The main idea is to read shapefiles for joining with other 
sources, or to enable conversion to e.g. a parquet file, which is capable of 
storing geometry data in binary format (WKB) on hdfs.
The implementation is based on the esri java lib, which can parse a single 
geometry definition. Custom code is written to read the whole file 
(ShapefileByteBufferCursor). The plugin also handles reading of the 
accompanying data file (dbf) and SRID information (srid). 
Sample usage:
- reading shp
```select *, ST_AsText(geom) from cp.`sample-data/CA-cities.shp`;```

- conversion to parquet
```alter session set `store.format`='parquet';```
```create table dfs.tmp.`/CA-cities-par` as select * from cp.`sample-data/CA-cities.shp`;```

There is also a sample parquet file in cp.`sample-data/CA-cities.parquet`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/k255/drill drill-gis-shp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #335


commit ecaa6ff5303cd179cc0c0f96518b1ee69ff40955
Author: potocki 
Date:   2016-01-22T11:21:04Z

ESRI Shapefile (shp) reader implemented as drill format plugin

commit 91ccd1ccf0d06802dcf0da2ee1ef83c903c248af
Author: potocki 
Date:   2016-01-22T12:19:00Z

added sample file in parquet format




---


[GitHub] drill pull request: DRILL-4091: Support for additional gis operati...

2016-01-19 Thread k255
Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/258#issuecomment-172858385
  
Added new functionality to transform the spatial reference (SRID) of 
geometries, based on Proj4J.
Usage: ST_Transform(geom, srcSRID, tgtSRID)

This lets you transform SRID in drill without using external tools!
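
To make concrete what such a transformation does to coordinates, here is a minimal sketch that hand-codes a single pair: EPSG:4326 (WGS84 longitude/latitude in degrees) to EPSG:3857 (spherical Web Mercator, metres) and back. This is not how the PR does it, since ST_Transform delegates to Proj4J for arbitrary SRIDs. The class and method names below are hypothetical and chosen only for illustration.

```java
/**
 * Illustration only: one hard-coded SRID pair (EPSG:4326 <-> EPSG:3857).
 * ST_Transform in the PR relies on Proj4J instead; these names are hypothetical.
 */
final class WebMercatorSketch {

  /** WGS84 semi-major axis in metres, used as the sphere radius by EPSG:3857. */
  static final double EARTH_RADIUS_M = 6378137.0;

  /** Converts longitude/latitude in degrees to Web Mercator x/y in metres. */
  static double[] toWebMercator(double lonDeg, double latDeg) {
    double x = EARTH_RADIUS_M * Math.toRadians(lonDeg);
    double y = EARTH_RADIUS_M
        * Math.log(Math.tan(Math.PI / 4.0 + Math.toRadians(latDeg) / 2.0));
    return new double[] {x, y};
  }

  /** Inverse of toWebMercator: x/y in metres back to longitude/latitude in degrees. */
  static double[] toWgs84(double x, double y) {
    double lonDeg = Math.toDegrees(x / EARTH_RADIUS_M);
    double latDeg = Math.toDegrees(
        2.0 * Math.atan(Math.exp(y / EARTH_RADIUS_M)) - Math.PI / 2.0);
    return new double[] {lonDeg, latDeg};
  }

  public static void main(String[] args) {
    double[] xy = toWebMercator(-122.4, 37.8);
    double[] lonLat = toWgs84(xy[0], xy[1]);
    System.out.println("x=" + xy[0] + " y=" + xy[1]);
    System.out.println("lon=" + lonLat[0] + " lat=" + lonLat[1]);
  }
}
```

A general-purpose library is still the right tool here, because most SRID pairs involve datum shifts and projection parameters that a closed-form formula like this one does not cover.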


---


[GitHub] drill pull request: DRILL-4091: Support for additional gis operati...

2015-11-16 Thread k255
Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/258#issuecomment-157026760
  
This extends the DRILL-3914 functionality.


---


[GitHub] drill pull request: DRILL-4091: Support for additional gis operati...

2015-11-16 Thread k255
GitHub user k255 opened a pull request:

https://github.com/apache/drill/pull/258

DRILL-4091: Support for additional gis operations in gis contrib module

Support for commonly used GIS functions in the gis contrib module: relate, 
contains, crosses, intersects, touches, difference, disjoint, equals, overlaps, 
buffer, union, get the x coordinate of a point, get the y coordinate of a point.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/k255/drill drill-gis-ext

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/258.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #258


commit 6081158304ab646b1c022f4ec047df0f6cdc5d1c
Author: potocki 
Date:   2015-11-16T13:05:18Z

Support for additional gis operations (relate, contains, touches, union, 
get x y of a point and more)




---


[GitHub] drill pull request: DRILL-3747: basic similarity search with simme...

2015-10-30 Thread k255
GitHub user k255 opened a pull request:

https://github.com/apache/drill/pull/224

DRILL-3747: basic similarity search with simmetric

Helps handle e.g. typos in search queries using popular algorithms like 
Levenshtein.
Sample query:
```
select levenshtein('foo', 'boo') from (VALUES(1)); //gives 0.67
```
and
```
select levenshtein('foo', 'bar') from (VALUES(1)); //not similar - gives 0
```
More:
https://github.com/k255/drill-fuzzy-search
https://en.wikipedia.org/wiki/Levenshtein_distance

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/k255/drill drill-fuzzysearch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #224


commit 51248358adf7ee71a744cccb7a22b45850f192a8
Author: potocki 
Date:   2015-10-30T18:54:41Z

basic similarity search with simmetric




---


[GitHub] drill pull request: [DRILL-3914]: support for geospatial query fun...

2015-10-16 Thread k255
Github user k255 commented on a diff in the pull request:

https://github.com/apache/drill/pull/191#discussion_r42217546
  
--- Diff: contrib/gis/src/main/java/org/apache/drill/exec/expr/fn/impl/gis/STAsText.java ---
@@ -0,0 +1,58 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl.gis;
+
+import javax.inject.Inject;
+
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.holders.VarBinaryHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+import io.netty.buffer.DrillBuf;
+
+@FunctionTemplate(name = "st_astext", scope = FunctionTemplate.FunctionScope.SIMPLE,
+  nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+public class STAsText implements DrillSimpleFunc {
+  @Param
+  VarBinaryHolder geom1Param;
+
+  @Output
+  VarCharHolder out;
+
+  @Inject
+  DrillBuf buffer;
+
+  public void setup() {
+  }
+
+  public void eval() {
+    com.esri.core.geometry.ogc.OGCGeometry geom1 = com.esri.core.geometry.ogc.OGCGeometry
+        .fromBinary(geom1Param.buffer.nioBuffer(geom1Param.start, geom1Param.end));
+
+    String geomWKT = geom1.asText();
+
+    int outputSize = geomWKT.getBytes().length;
+    out.buffer = buffer.reallocIfNeeded(outputSize);
--- End diff --

Thanks, this was helpful. You're right, the subsequent executions failed with 
"Tried to remove unmanaged buffer.". It's fixed now. Is it also valid to use 
BufferManager.getManagedBuffer(size) somehow (maybe instead of injecting the 
DrillBuf)?
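
For readers following along, here is a toy sketch of the general "reallocate if needed, then keep the returned reference" pattern this discussion is about. It uses a plain java.nio.ByteBuffer as a stand-in and is not Drill's DrillBuf or BufferManager; the class and the helper names are hypothetical and only mimic the idea. The exact fix applied in the PR is not shown in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/**
 * Toy stand-in for the reallocation pattern, built on java.nio.ByteBuffer.
 * Not Drill code: the class and method names are hypothetical.
 */
final class ReallocSketch {

  /** Returns buf when it is large enough, otherwise a new buffer of the requested size. */
  static ByteBuffer reallocIfNeeded(ByteBuffer buf, int size) {
    if (buf.capacity() >= size) {
      return buf;
    }
    return ByteBuffer.allocate(size);
  }

  /**
   * Encodes text as UTF-8 into the scratch buffer, growing it when required.
   * The caller must keep the returned buffer, which may differ from scratch.
   */
  static ByteBuffer writeText(ByteBuffer scratch, String text) {
    // Size by encoded bytes, not by character count.
    byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
    ByteBuffer target = reallocIfNeeded(scratch, bytes.length);
    target.clear();
    target.put(bytes);
    target.flip(); // position 0, limit = number of bytes written
    return target;
  }

  public static void main(String[] args) {
    ByteBuffer scratch = ByteBuffer.allocate(8);
    // Reassign the field/variable so the next call sees the current buffer.
    scratch = writeText(scratch, "POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))");
    System.out.println(StandardCharsets.UTF_8.decode(scratch.duplicate()));
  }
}
```

The point of the sketch is the reassignment in `main`: a caller that keeps using the old reference after a reallocation is working with a buffer that is no longer the current one.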


---


[GitHub] drill pull request: [DRILL-3914]: support for geospatial query fun...

2015-10-15 Thread k255
Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/191#issuecomment-148369490
  
Fixed a bug with complex geometries caused by a too-small buffer. Now it's 
possible to build more complex geometries. It would be nice if somebody could 
check the way I handled it (buffer reallocation).
I have also made some progress on integration with GIS tools, as shown here: 
http://bit.ly/1Rcvrjd


---


[GitHub] drill pull request: [DRILL-3914]: support for geospatial query fun...

2015-10-08 Thread k255
Github user k255 commented on the pull request:

https://github.com/apache/drill/pull/191#issuecomment-146679625
  
I added some general tests to check if geometry functions work as expected.

I'm happy that you like it. Currently it's quite simple, but it can grow.
One direction is to address the limited size of varbinary (introduce a new 
type or extend the size of the existing one), because it limits geometries to 
just simple shapes.


---


[GitHub] drill pull request: [DRILL-3914]: support for geospatial query fun...

2015-10-08 Thread k255
GitHub user k255 opened a pull request:

https://github.com/apache/drill/pull/191

[DRILL-3914]: support for geospatial query functionality

A sample dataset is provided on the classpath; after building from the fork 
repository, you can query it like:
select * from cp.`sample-data/CA-cities.csv` limit 5;

For details on the current geospatial functionality, please see:
https://github.com/k255/drill-gis

Currently the solution works for common use cases, but it is based on the 
varbinary data type, which has limitations for more complex geometries (size 
limit).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/k255/drill master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/191.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #191


commit dc19cb732645b2d168f04eec521848992807cf07
Author: potocki 
Date:   2015-10-08T06:10:33Z

gis contrib module with basic spatial queries functionality




---