Self bug-report:
IN: scratchpad [ 1 2 3 ] [ 1 ] similarity .
1
Oh well. I suppose we'd need something more like:
: similarity ( def1 def2 -- score )
    [ weighted-number-of-shared-nodes ]
    [
        [ max-length ]
        [ [ number-of-different-nodes ] 2map-sum ] 2bi -
    ] 2bi over + / ;
But again, the metric is a little ambiguous (I'm treating sequences as n-ary
trees). It's just a heuristic, I guess.
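
For what it's worth, the tail of that definition (over + /) just turns a
shared-node count S and a difference count D into the ratio S / (S + D). A
minimal listener sketch with made-up counts (3 and 1 are illustrative only,
not output from the attached code):

```factor
USING: kernel math prettyprint ;

! Hypothetical counts: S = 3 shared nodes, D = 1 differing node.
! over + / has the effect ( S D -- S/(S+D) ):
!   3 1  ->  3 1 3  ->  3 4  ->  3/4
3 1 over + / .
```

This prints 3/4. The score only reaches 1 when D is 0.
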
--Alex Vondrak
On Wed, Apr 10, 2013 at 7:51 PM, Alex Vondrak <ajvond...@gmail.com> wrote:
> In case anyone's interested, attached is my interpretation of the "tree
> similarity" metric given in the paper I linked. The definition was
> somewhat vague, so I just did what I thought made sense.
>
> IN: scratchpad \ move-to-file \ move-to-dir word-similarity .
> 35/39
> IN: scratchpad \ move-to-file \ usage word-similarity .
> 0
> IN: scratchpad \ move-to-dir \ move-to-dir word-similarity .
> 1
>
> It would be interesting to implement the rest of the algorithm. See how
> it does in Factor.
>
> Regards,
> --Alex Vondrak
>
>
>
> On Wed, Apr 10, 2013 at 6:35 PM, John Benediktsson <mrj...@gmail.com> wrote:
>
>> You don't really want a flattened intersection:
>>
>> A word definition like this:
>>
>> : foo ( x -- x ) [ 2^ ] [ bitor ] bi ;
>>
>> Shouldn't match ``set-bit``:
>>
>> : set-bit ( x n -- y ) 2^ bitor ; inline
>>
>> You probably want something like a deep-each that, for each quotation,
>> collects every subsequence that duplicates a subsequence of some other
>> quotation, or something ambitious like that.
>>
>> In the lint vocabulary, the lint word looks at all callables, trying to
>> find any definition that the callable includes as a subsequence:
>>
>> GENERIC: lint ( obj -- seq )
>>
>> M: callable lint
>>     [ lint-definitions-keys get-global ] dip [ subseq? ] curry
>>     filter ;
>>
>> M: object lint drop f ;
>>
>> M: word lint
>>     def>> [ callable? ] deep-filter [ lint ] map concat ;
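
A small listener sketch of why a subseq? test avoids the foo / set-bit false
match that a flattened intersection would give. It assumes the stack effect
subseq? ( subseq seq -- ? ), which is the argument order the lint code above
relies on. The first target quotation is made up for illustration:

```factor
USING: kernel math prettyprint sequences ;

! The body of set-bit appears verbatim inside this (hypothetical) quotation,
! so this prints t.
[ 2^ bitor ] [ dup 2^ bitor swap ] subseq? .

! In foo's body the same words sit inside nested quotations. The top-level
! elements are [ 2^ ], [ bitor ] and bi, so this prints f.
[ 2^ bitor ] [ [ 2^ ] [ bitor ] bi ] subseq? .
```

The match is on contiguous top-level elements only, which is why the word
method above first collects every nested callable with deep-filter.
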
>>
>>
>>
>>
>>
>> On Wed, Apr 10, 2013 at 5:19 PM, leonard <leonard14...@gmail.com> wrote:
>>
>>> On Wed, Apr 10, 2013 at 2:33 PM, John Benediktsson <mrj...@gmail.com> wrote:
>>>
>>>> You should really look at how the lint tool works.
>>>>
>>>> In particular, look at "lint" and see how it looks for a word which has
>>>> a definition that is contained in another word (where the second word
>>>> should be calling the first instead of duplicating its definition).
>>>>
>>>> Your version could look for common subsequences instead, perhaps.
>>>>
>>>
>>> Is there a word for calculating the intersection of two deep sequences?
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Precog is a next-generation analytics platform capable of advanced
>>> analytics on semi-structured data. The platform includes APIs for
>>> building
>>> apps and a phenomenal toolset for data science. Developers can use
>>> our toolset for easy data analysis & visualization. Get a free account!
>>> http://www2.precog.com/precogplatform/slashdotnewsletter
>>> _______________________________________________
>>> Factor-talk mailing list
>>> Factor-talk@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/factor-talk
>>>
>>>
>>
>>
>>
>>
>