Re: [Jprogramming] Vector Similarity

2018-03-01 Thread 'Bo Jacoby' via Programming
Thank you, Skip! But I do not know how to make a .edu Advanced Search. /Bo. On Wednesday, 28 February 2018 at 22:18, Skip Cave wrote: No problem. I was able to download it from here: https://www.academia.edu/10031088/ORDINAL_FRACTIONS_-_the_algebra_of_data Skip C

Re: [Jprogramming] Vector Similarity

2018-02-28 Thread Skip Cave
No problem. I was able to download it from here: https://www.academia.edu/10031088/ORDINAL_FRACTIONS_-_the_algebra_of_data Skip Cave Cave Consulting LLC On Wed, Feb 28, 2018 at 3:04 PM, Martin Kreuzer wrote: > Skip - > > The paper can be downloaded from this link > https://www.academia.edu/peop

Re: [Jprogramming] Vector Similarity

2018-02-28 Thread Martin Kreuzer
Skip - The paper can be downloaded from this link https://www.academia.edu/people/search?utf8=%E2%9C%93&q=bo+jacoby+ordinal+fractions provided you have an account to log into ... If that's not working out -and Bo agrees- I could send you the short paper by PM. -M At 2018-02-28 14:25, you wro

Re: [Jprogramming] Vector Similarity

2018-02-28 Thread Skip Cave
Bo, .edu Advanced Search found 9,227 papers containing “ORDINAL FRACTIONS” Search within the full text of 20 million papers Skip Cave Cave Consulting LLC On Tue, Feb 20, 2018 at 4:20 PM, 'Bo Jacoby' via Programming < programm...@jsoftware.com> wrote: > ORDINAL FRACTIO

Re: [Jprogramming] Vector Similarity

2018-02-21 Thread R.E. Boss
Why not submit it to arXiv.org? R.E. Boss > -Original Message- > From: Programming [mailto:programming-boun...@forums.jsoftware.com] > On Behalf Of 'Bo Jacoby' via Programming > Sent: Wednesday, 21 February 2018 09:07 > To: programm...@jsoftware.com > Subjec

Re: [Jprogramming] Vector Similarity

2018-02-21 Thread Raul Miller
Sure, you can represent words that way. That's basically the same kind of representation that you get by encoding each word as a unique integer (index), just less space efficient. So, for example, in J, if you had 8 words, you could represent the unique sequence of words as i.8 or you cou

Re: [Jprogramming] Vector Similarity

2018-02-21 Thread Skip Cave
Raul, Well, one way to start word2vec is to assign a boolean vector to each word, with a single boolean one in a different place for each unique word. That's why it's called 'one hot' embedding. However, after training the word set with a shallow, two-layer neural network, and doing significant
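The one-hot starting point described in this message can be sketched in J as follows. This is only an illustration: the four-word vocabulary and the names `words` and `onehot` are hypothetical and do not come from the thread.

```j
words  =: ;: 'the quick brown fox'   NB. hypothetical vocabulary, boxed word by word
onehot =: = i. # words               NB. 4 by 4 boolean table, one row per word
2 { onehot                           NB. 0 0 1 0, the vector for 'brown'
```

Each row has a single 1 in a position unique to its word, so the table needs one dimension per distinct word in the vocabulary.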

Re: [Jprogramming] Vector Similarity

2018-02-21 Thread Raul Miller
Skip, Are you sure you have the word2vec description right? https://en.wikipedia.org/wiki/Word2vec claims the dimensionality of word2vec is typically in the range of 100 to 1000, which would allow treatment of a rather limited vocabulary if each dimension corresponded to a distinct word. The i

Re: [Jprogramming] Vector Similarity

2018-02-21 Thread 'Bo Jacoby' via Programming
Thank you, Skip! I tried to publish an Ordinal Fraction article in Wikipedia, but it was removed because original research is not allowed in Wikipedia. However, somebody copied it into this link: StateMaster - Encyclopedia: Ordinal fraction. I think that the most obvious use of ordinal fractions

Re: [Jprogramming] Vector Similarity

2018-02-20 Thread Skip Cave
Bo, I read your "Ordinal Fractions" paper. In your paper you propose ordinal fractions as a system for numerically defining data categories and indices, along with an associated algebra for manipulating th

Re: [Jprogramming] Vector Similarity

2018-02-20 Thread Raul Miller
Oops, sorry about that. Here's a fixed implementation: 1 1 1 (prod % %:@*&(prod~)) 0 3 3 0.816497 Thanks, -- Raul On Tue, Feb 20, 2018 at 4:41 PM, Skip Cave wrote: > Very nice! Thanks Raul. > > However, there is something wrong about the cosine similarity, > which should always be betwee
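For reuse, the corrected expression in this message can be given a name. A minimal sketch, assuming the definition `prod=:+/ .*` quoted elsewhere in the thread; the name `cossim` is hypothetical and is not used by the posters.

```j
prod   =: +/ .*                  NB. inner (dot) product, as defined in the thread
cossim =: prod % %:@*&(prod~)    NB. x.y divided by the square root of (x.x) * (y.y)
1 1 1 cossim 0 3 3               NB. 0.816497, the result reported in the message
1 2 3 cossim 1 2 3               NB. a vector compared with itself gives 1
```

Here `prod~` used monadically takes the inner product of a vector with itself, so the denominator is the product of the two vector lengths.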

Re: [Jprogramming] Vector Similarity

2018-02-20 Thread 'Bo Jacoby' via Programming
ORDINAL FRACTIONS - the algebra of data. This paper was submitted to the 10th World Computer Congress, IFIP 1986 conference, but rejected by the referee. At 22:42 on Tuesday the

Re: [Jprogramming] Vector Similarity

2018-02-20 Thread Skip Cave
Very nice! Thanks Raul. However, there is something wrong about the cosine similarity, which should always be between 0 & 1: prod=:+/ .* 1 1 1 (prod % %:@*&prod) 0 3 3 1.41421 Skip On Tue, Feb 20, 2018 at 2:27 PM, Raul Miller wrote: > I don't know about blog entries - I think there are prob

Re: [Jprogramming] Vector Similarity

2018-02-20 Thread Raul Miller
I don't know about blog entries - I think there are probably some that partially cover this topic. But it shouldn't be hard to implement most of these operations: Euclidean distance: 1 0 0 +/&.:*:@:- 0 1 0 1.41421 Manhattan distance: 1 0 0 +/@:|@:- 0 1 0 2 Minkowski distances: minko
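The first two expressions in this message can be named so they are easier to reuse. A minimal sketch: the verb bodies and the sample results are taken from the message, while the names `euclid` and `manhattan` are hypothetical. The Minkowski definition is cut off in this listing and is not reconstructed here.

```j
euclid    =: +/&.:*:@:-    NB. square root of the sum of squared differences
manhattan =: +/@:|@:-      NB. sum of absolute differences
1 0 0 euclid 0 1 0         NB. 1.41421, as in the message
1 0 0 manhattan 0 1 0      NB. 2, as in the message
```

In `euclid`, `&.:*:` applies the sum "under" squaring: the differences are squared, summed, and the inverse of squaring (square root) is applied to the total.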

[Jprogramming] Vector Similarity

2018-02-20 Thread Skip Cave
One of the hottest topics in data science today is the representation of data characteristics using large multi-dimensional arrays. Each datum is represented as a data point or multi-element vector in an array that can have hundreds of dimensions. In these arrays, each dimension represents a differ