[jira] [Updated] (LUCENE-7110) Add Shape Support to BKD (extend to an R*/X-Tree data structure)

2018-01-15 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7110:
---
Description: 
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to a 
future release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
 * recurse each branch that overlaps with the MBR of the query shape
 * compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will never be overlap 
between sibling nodes (because you're using the point data as the keys). By 
exploiting this property the tree algorithms (split, merge, etc) are relatively 
cheap (hence their performance boost over postings based numerics). By 
modifying the key data, and extending the tree generation algorithms BKD logic 
can be extended to support Shape data using the MBR as the Key and modifying 
split and merge based on the criteria needed for optimizing a shape-based data 
structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate (likely to lucene-spatial or lucene-spatial-extras depending on 
the library needs).

  was:
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will never be overlap 
between sibling nodes (because you're using the point data as the keys). By 
exploiting this property the tree algorithms (split, merge, etc) are relatively 
cheap (hence their performance boost over postings based numerics). By 
modifying the key data, and extending the tree generation algorithms BKD logic 
can be extended to support Shape data using the MBR as the Key and modifying 
split and merge based on the criteria needed for optimizing a shape-based data 
structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate (likely to lucene-spatial or lucene-spatial-extras depending on 
the library needs).


> Add Shape Support to BKD (extend to an R*/X-Tree data structure)
> 
>
> Key: LUCENE-7110
> URL: https://issues.apache.org/jira/browse/LUCENE-7110
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
>Priority: Major
>
> I've been tinkering with this off and on for a while and its showing some 
> promise so I'm going to open an issue to (eventually) add this feature to a 
> future release.
> R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
> like the internal node, the key for each leaf 

[jira] [Updated] (LUCENE-7110) Add Shape Support to BKD (extend to an R*/X-Tree data structure)

2016-03-19 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7110:
---
Description: 
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will be no overlap between 
sibling nodes (because you're using the point data as the keys). By exploiting 
this property the tree algorithms (split, merge, etc) are relatively cheap 
(hence their performance boost over postings based numerics). By modifying the 
key data, and extending the tree generation algorithms BKD logic can be 
extended to support Shape data using the MBR as the Key and modifying split and 
merge based on the criteria needed for optimizing a shape-based data structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate to lucene-spatial.

  was:
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem
The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will be no overlap between 
sibling nodes (because you're using the point data as the keys). By exploiting 
this property the tree algorithms (split, merge, etc) are relatively cheap 
(hence their performance boost over postings based numerics). By modifying the 
key data, and extending the tree generation algorithms BKD logic can be 
extended to support Shape data using the MBR as the Key and modifying split and 
merge based on the criteria needed for optimizing a shape-based data structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate to lucene-spatial.


> Add Shape Support to BKD (extend to an R*/X-Tree data structure)
> 
>
> Key: LUCENE-7110
> URL: https://issues.apache.org/jira/browse/LUCENE-7110
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
>
> I've been tinkering with this off and on for a while and its showing some 
> promise so I'm going to open an issue to (eventually) add this feature to 
> either a 6.x or (more likely) a 7.x release.
> R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
> like the internal node, the key for each leaf node is the Minimum Bounding 
> Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding 
> Rectangle) of 

[jira] [Updated] (LUCENE-7110) Add Shape Support to BKD (extend to an R*/X-Tree data structure)

2016-03-19 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7110:
---
Description: 
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will never be overlap 
between sibling nodes (because you're using the point data as the keys). By 
exploiting this property the tree algorithms (split, merge, etc) are relatively 
cheap (hence their performance boost over postings based numerics). By 
modifying the key data, and extending the tree generation algorithms BKD logic 
can be extended to support Shape data using the MBR as the Key and modifying 
split and merge based on the criteria needed for optimizing a shape-based data 
structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate to lucene-spatial.

  was:
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will be no overlap between 
sibling nodes (because you're using the point data as the keys). By exploiting 
this property the tree algorithms (split, merge, etc) are relatively cheap 
(hence their performance boost over postings based numerics). By modifying the 
key data, and extending the tree generation algorithms BKD logic can be 
extended to support Shape data using the MBR as the Key and modifying split and 
merge based on the criteria needed for optimizing a shape-based data structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate to lucene-spatial.


> Add Shape Support to BKD (extend to an R*/X-Tree data structure)
> 
>
> Key: LUCENE-7110
> URL: https://issues.apache.org/jira/browse/LUCENE-7110
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
>
> I've been tinkering with this off and on for a while and its showing some 
> promise so I'm going to open an issue to (eventually) add this feature to 
> either a 6.x or (more likely) a 7.x release.
> R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
> like the internal node, the key for each leaf node is the Minimum Bounding 
> Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding 
> 

[jira] [Updated] (LUCENE-7110) Add Shape Support to BKD (extend to an R*/X-Tree data structure)

2016-03-19 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7110:
---
Description: 
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will never be overlap 
between sibling nodes (because you're using the point data as the keys). By 
exploiting this property the tree algorithms (split, merge, etc) are relatively 
cheap (hence their performance boost over postings based numerics). By 
modifying the key data, and extending the tree generation algorithms BKD logic 
can be extended to support Shape data using the MBR as the Key and modifying 
split and merge based on the criteria needed for optimizing a shape-based data 
structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate (likely to lucene-spatial or lucene-spatial-extras depending on 
the library needs).

  was:
I've been tinkering with this off and on for a while and its showing some 
promise so I'm going to open an issue to (eventually) add this feature to 
either a 6.x or (more likely) a 7.x release.

R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
like the internal node, the key for each leaf node is the Minimum Bounding 
Range (MBR - sometimes "incorrectly" referred to as Minimum Bounding Rectangle) 
of the shape. Inserting a shape then boils down to the best way of optimizing 
the tree structure. This optimization is driven by a set of criteria for 
choosing the appropriate internal key (e.g., minimizing overlap between 
siblings, maximizing "squareness", minimizing area, maximizing space usage). 
Query is then (a bit oversimplified) a two-phase process:
* recurse each branch that overlaps with the MBR of the query shape
* compute the relation with the leaf node(s) - in higher dimensions (3+) this 
becomes an increasingly difficult computational geometry problem

The current BKD implementation is a special simplified case of an R*/X tree 
where, for Point data, it is always guaranteed there will never be overlap 
between sibling nodes (because you're using the point data as the keys). By 
exploiting this property the tree algorithms (split, merge, etc) are relatively 
cheap (hence their performance boost over postings based numerics). By 
modifying the key data, and extending the tree generation algorithms BKD logic 
can be extended to support Shape data using the MBR as the Key and modifying 
split and merge based on the criteria needed for optimizing a shape-based data 
structure.

The initial implementation (based on limitations of the GeoAPI) will support 2D 
shapes only. Once the GeoAPI can performantly handle 3D shapes the change is 
relatively trivial to add the third dimension to the tree generation code.

Like everything else, this feature will be created in sandbox and, once mature, 
will graduate to lucene-spatial.


> Add Shape Support to BKD (extend to an R*/X-Tree data structure)
> 
>
> Key: LUCENE-7110
> URL: https://issues.apache.org/jira/browse/LUCENE-7110
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
>
> I've been tinkering with this off and on for a while and its showing some 
> promise so I'm going to open an issue to (eventually) add this feature to 
> either a 6.x or (more likely) a 7.x release.
> R*/X-Tree is a data structure designed to support Shapes (2D, 3D, nD) where, 
> like the internal node, the key for each leaf node is the Minimum Bounding 
> Range (MBR