Craig,
how do you map _all_ properties to the integer (or rather numeric/long?) space?

Does the mapping then also rely on the alphabetic (or comparison) order of the
properties?

Interesting approach. Is your space then as n-dimensional as the number of
properties you have?
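
To make the question concrete, here is a rough sketch of how I picture such a
mapping. It is not how your composite index actually works: the mapper
functions, the bucket sizes and the property names are all my own assumptions,
and plain Python stands in for the real index. Each property gets one
value->index mapper and therefore one dimension, and the dimension order is
simply fixed by sorting the property keys.

```python
import math

# Hypothetical value->index mappers, one per property (one per dimension).
# Names and step sizes are illustrative assumptions, not the real index API.
def map_age(age, step=5):
    """Bucket a numeric age into integer steps of `step` years."""
    return age // step

def map_name(name, prefix_len=2):
    """Map a string to an integer built from its lowercased prefix,
    so that the integer order follows the alphabetic order of the prefix."""
    padded = name.lower()[:prefix_len].ljust(prefix_len, "\0")
    value = 0
    for ch in padded:
        value = value * 256 + min(ord(ch), 255)
    return value

MAPPERS = {"name": map_name, "age": map_age}

def to_index_point(props):
    """Map a property dict to a point in integer space.
    The dimension order is fixed by the sorted property keys."""
    return tuple(MAPPERS[key](props[key]) for key in sorted(MAPPERS))

def distance(a, b):
    """Euclidean distance between the index points of two property dicts."""
    pa, pb = to_index_point(a), to_index_point(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))

neo_a = {"name": "Neo", "age": 30}
neo_b = {"name": "Neo", "age": 32}  # same 5-year bucket as neo_a
neo_c = {"name": "Neo", "age": 47}  # three buckets away from neo_a

print(distance(neo_a, neo_b))  # 0.0: both land on the same index point
print(distance(neo_a, neo_c))  # 3.0
```

In this sketch the space has exactly as many dimensions as there are mappers,
and the relative scale of the dimensions (a name prefix versus an age bucket)
is entirely a choice of the mapper, which is the part I am curious about.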

Cheers

Michael

On 02.02.2011 at 00:59, Craig Taverner wrote:

> Here is a crazy idea - how about taking the properties you care about and
> dropping them into a combined Lucene index? Then all results for nodes with
> the same properties would be 'ambiguous'. Moving this forward to degrees of
> ambiguity might be possible by creating the combined 'value' using a reduced
> resolution of the properties (to increase similarity so the index will see
> them as identical).
> 
> Another option is the 'still very much in progress' composite index I
> started in December. Since all properties are mapped into normal integer
> space, the Euclidean distance of the first level index nodes from each other
> is a discrete measure of similarity. A distance of zero means that the nodes
> attach to the same index node, and are very similar or identical. Higher
> values mean greater dissimilarity. This index theoretically supports any
> number of properties of any type (including strings) and allows you to plug
> in your own value->index mappers, which means you can control what you mean
> by 'similar'.
> 
> On Tue, Feb 1, 2011 at 9:52 PM, Ben Sand <b...@bensand.com> wrote:
> 
>> I was working on a project that used matching algorithms a while back.
>> 
>> What you have is an n-dimensional matching problem. I can't remember
>> specifically what the last project was using, but this and the linked
>> algos may be what you're looking for:
>> http://en.wikipedia.org/wiki/Mahalanobis_distance
>> 
>> On 2 February 2011 07:34, Tim McNamara <paperl...@timmcnamara.co.nz> wrote:
>> 
>>> Say I have two nodes,
>>> 
>>> 
>>> { "type": "person", "name": "Neo" }
>>> { "type": "person", "name": "Neo" }
>>> 
>>> 
>>> 
>>> Over time, I learn their locations. They both live in the same city. This
>>> increases the chances that they're the same person. However, over time it
>>> turns out that their ages differ, so it's far less likely that they
>>> are the same Neo.
>>> 
>>> 
>>> Is there anything inside of Neo4j that attempts to determine how close two
>>> nodes are? E.g. to what extent their subtrees and properties match?
>>> Additionally, can anyone suggest literature for algorithms for
>>> disambiguating the two entities?
>>> 
>>> 
>>> If I wanted to implement something that searches for similarities, that
>>> returns a probability of a match, can I do this within the database or
>>> should I implement it within the application?
>>> 
>>> 
>>> --
>>> Tim McNamara
>>> @timClicks
>>> http://timmcnamara.co.nz
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Neo4j mailing list
>>> User@lists.neo4j.org
>>> https://lists.neo4j.org/mailman/listinfo/user
>>> 
