[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902806#comment-13902806 ]

ASF subversion and git services commented on LUCENE-5408:
----------------------------------------------------------

Commit 1568818 from [~dsmiley] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1568818 ]

LUCENE-5408: Spatial SerializedDVStrategy

> SerializedDVStrategy -- match geometries in DocValues
> -----------------------------------------------------
>
>                 Key: LUCENE-5408
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5408
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 4.7
>
>         Attachments: LUCENE-5408_GeometryStrategy.patch,
> LUCENE-5408_SerializedDVStrategy.patch
>
>
> I've started work on a new SpatialStrategy implementation I'm tentatively
> calling SerializedDVStrategy. It's similar to the [JtsGeoStrategy in
> Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts]
> but a little different in the details -- certainly faster. Using Spatial4j
> 0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is
> internally WKB format) and the strategy will put it in a
> BinaryDocValuesField. In practice the shape is likely a polygon but it
> needn't be. Then I'll implement a Filter that returns a DocIdSetIterator
> that evaluates a given document passed via advance(docid) to see if the
> query shape matches a shape in DocValues. It's improper usage for it to be
> used in a situation where it will evaluate every document id via nextDoc().
> And in practice the DocValues format chosen should be a disk resident one
> since each value tends to be kind of big.
> This spatial strategy in and of itself has no _index_; it's O(N) where N is
> the number of documents that get passed thru it. So it should be placed last
> in the query/filter tree so that the other queries limit the documents it
> needs to see. At a minimum, another query/filter to use in conjunction is
> another SpatialStrategy like RecursivePrefixTreeStrategy.
> Eventually once the PrefixTree grid encoding has a little bit more metadata,
> it will be possible to further combine the grid & this strategy in such a way
> that many documents won't need to be checked against the serialized geometry.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
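
The quoted description boils down to "store each shape as bytes per document, then verify only the documents another filter lets through". A minimal sketch of that idea follows. It is a toy model and not the Lucene or Spatial4j API: the class SerializedShapeSketch, the Rect type, the 32-byte encoding, the in-memory docValues map and the verify method are all hypothetical stand-ins chosen so the example is self-contained.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SerializedShapeSketch {

    /** Hypothetical stand-in for a shape: an axis-aligned rectangle. */
    static final class Rect {
        final double minX, minY, maxX, maxY;

        Rect(double minX, double minY, double maxX, double maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }

        boolean intersects(Rect o) {
            return minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY;
        }

        /** Serialize to 32 bytes; a toy replacement for a real binary codec. */
        byte[] toBytes() {
            return ByteBuffer.allocate(32)
                    .putDouble(minX).putDouble(minY).putDouble(maxX).putDouble(maxY)
                    .array();
        }

        static Rect fromBytes(byte[] bytes) {
            ByteBuffer b = ByteBuffer.wrap(bytes);
            return new Rect(b.getDouble(), b.getDouble(), b.getDouble(), b.getDouble());
        }
    }

    /** Per-document serialized shapes, standing in for a binary doc-values column. */
    static final Map<Integer, byte[]> docValues = new HashMap<>();

    /** Checks one document: deserialize its stored shape and test it against the query. */
    static boolean matches(int docId, Rect query) {
        byte[] bytes = docValues.get(docId);
        return bytes != null && Rect.fromBytes(bytes).intersects(query);
    }

    /**
     * Verifies only the candidates handed in by a cheaper pre-filter, so the
     * cost grows with the number of candidates rather than with the corpus.
     */
    static List<Integer> verify(int[] candidates, Rect query) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidates) {
            if (matches(docId, query)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        docValues.put(0, new Rect(0, 0, 1, 1).toBytes());
        docValues.put(1, new Rect(5, 5, 6, 6).toBytes());
        docValues.put(2, new Rect(0.5, 0.5, 2, 2).toBytes());
        Rect query = new Rect(0.9, 0.9, 3, 3);
        // Candidates 0 and 1 are assumed to come from a coarse pre-filter.
        System.out.println(verify(new int[] {0, 1}, query)); // prints [0]
    }
}
```

Document 2 also intersects the query rectangle but is never deserialized, because the assumed pre-filter did not pass it along. That is the reason the description asks for this strategy to sit last in the query/filter tree: every document it sees costs one deserialization and one geometry test.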
[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902805#comment-13902805 ]

ASF subversion and git services commented on LUCENE-5408:
----------------------------------------------------------

Commit 1568817 from [~dsmiley] in branch 'dev/trunk'
[ https://svn.apache.org/r1568817 ]

LUCENE-5408: fixed tests; some strategies require DocValues
[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902783#comment-13902783 ]

ASF subversion and git services commented on LUCENE-5408:
----------------------------------------------------------

Commit 1568807 from [~dsmiley] in branch 'dev/trunk'
[ https://svn.apache.org/r1568807 ]

LUCENE-5408: Spatial SerializedDVStrategy