[julia-users] ANN: Glove.jl

2015-09-29 Thread Dom Luna
Hey all,

I started this a couple of months back but ran into a couple of issues and 
sort of just had it on the backburner. I worked on it these last couple of 
days and I've gotten it to a usable state.

https://github.com/domluna/Glove.jl

I haven't done anything to make it parallel yet, that would be the next big 
performance win.

Any tips or contributions to make it better would be more than welcome :)


[julia-users] Re: ANN: Glove.jl

2015-09-29 Thread Dom Luna
Thanks Kevin. This slipped my mind.

Glove (or rather GloVe, Glove is just easier to type) stands for Global 
Vectors for Word Representation. The package implements the algorithm 
described at http://nlp.stanford.edu/projects/glove/. The idea is to represent
words as a vector of floats that capture word similarities. For example, 
"king - man + woman = queen" (operating on word vectors). The other popular 
implementation is word2vec, https://code.google.com/p/word2vec/.
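The analogy arithmetic can be illustrated with made-up vectors (modern Julia syntax; real GloVe embeddings are learned and typically 50-300 dimensional, and `cosine`/`nearest` here are hypothetical helper names, not part of the package):

```julia
# Toy illustration of "king - man + woman ≈ queen" with made-up 3-d vectors.
using LinearAlgebra

vecs = Dict(
    "king"  => [0.9, 0.8, 0.1],
    "queen" => [0.9, 0.1, 0.8],
    "man"   => [0.1, 0.9, 0.1],
    "woman" => [0.1, 0.1, 0.9],
)

cosine(a, b) = dot(a, b) / (norm(a) * norm(b))

# Return the word whose vector is most similar to `target`.
function nearest(vecs, target; exclude = String[])
    best, bestsim = "", -Inf
    for (w, v) in vecs
        w in exclude && continue
        s = cosine(v, target)
        if s > bestsim
            best, bestsim = w, s
        end
    end
    return best
end

analogy = vecs["king"] .- vecs["man"] .+ vecs["woman"]
println(nearest(vecs, analogy, exclude = ["king"]))  # -> queen
```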


[julia-users] Re: Julia Summer of Code

2015-05-27 Thread Dom Luna
I'd be interested in bridging Julia and Torch. I believe this has been 
thought about before https://github.com/soumith/NeuralNetworks.jl. What are 
the challenges to starting some work on this?

If not I'd like to work on the JuliaWeb project. I was watching the GoSF 
meeting last night via live stream and in Go 1.5 (master) shared libraries 
are available. This got me thinking it would be cool to interface between 
Julia and Go. Go is known for its server capabilities so leveraging this 
could be very useful. I realize this sounds a bit crazy. It's an idea I've 
had for a little while now that just perhaps became viable. Go runs on a 
boatload of architectures so that shouldn't be a problem either.
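The Go 1.5 piece here is real (`-buildmode=c-shared` landed in Go 1.5); everything else in this sketch is hypothetical -- the library name, the `Add` function, and the path are made up for illustration:

```julia
# Hypothetical sketch of calling a Go shared library from Julia.
# Assumes the Go file below was built with:
#   go build -buildmode=c-shared -o libadd.so add.go
#
#   // add.go
#   package main
#   import "C"
#   //export Add
#   func Add(a, b C.int) C.int { return a + b }
#   func main() {}
#
# The exported symbol is then callable like any C function:
result = ccall((:Add, "./libadd.so"), Cint, (Cint, Cint), 2, 3)
println(result)  # 5, if the library above exists
```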
On Friday, May 15, 2015 at 1:57:24 PM UTC-4, Viral Shah wrote:

 Folks,

 The Moore Foundation is generously funding us to allow for 6-8 Julia 
 Summer of Code projects. Details will be published soon, but if you are 
 interested, please mark your calendars and plan your projects.

 -viral



[julia-users] Avoid memory allocations when reading from matrices

2015-05-23 Thread Dom Luna
Reposting this from the Gitter chat since this list seems more active.

I'm writing a GloVe module to learn Julia.

How can I avoid memory allocations? My main function does a lot of 
random indexing into matrices.

A[i,  :] = 0.5 * B[i, :]

In this case *i* isn't from a linear sequence; I'm not sure that matters. 
Anyway, I've done some analysis and I know B[i, :] is the issue here since it's 
creating a copy.

https://github.com/JuliaLang/julia/blob/master/base/array.jl#L309 makes the 
copy
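A minimal sketch of the copy-vs-view distinction (using modern `@view` syntax; in 0.3/0.4-era Julia the equivalent non-copying wrappers were `sub`/`slice`):

```julia
B = rand(4, 4)

s = B[2, :]        # slicing allocates a new Vector (a copy)
v = @view B[2, :]  # no copy; a SubArray aliasing B's memory

v[1] = 99.0
@assert B[2, 1] == 99.0  # writing through the view mutates B
@assert s[1] != 99.0     # the earlier copy is unaffected
```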


I tried to do it via a loop but it looks like that doesn't help either. In 
fact, it seems to allocate slightly more memory, which seems really odd.

Here's some of the code, it's a little messy since I'm commenting different 
approaches I'm trying out.

type Model{T}
W_main::Matrix{T}
W_ctx::Matrix{T}
b_main::Vector{T}
b_ctx::Vector{T}
W_main_grad::Matrix{T}
W_ctx_grad::Matrix{T}
b_main_grad::Vector{T}
b_ctx_grad::Vector{T}
covec::Vector{Cooccurence}
end

# Each vocab word is associated with a main vector and a context vector.
# The paper initializes them to values in [-0.5, 0.5] / (vecsize + 1) and
# the gradients to 1.0.
#
# The +1 term is for the bias.
function Model(comatrix; vecsize=100)
vs = size(comatrix, 1)
Model(
(rand(vecsize, vs) - 0.5) / (vecsize + 1),
(rand(vecsize, vs) - 0.5) / (vecsize + 1),
(rand(vs) - 0.5) / (vecsize + 1),
(rand(vs) - 0.5) / (vecsize + 1),
ones(vecsize, vs),
ones(vecsize, vs),
ones(vs),
ones(vs),
CoVector(comatrix), # not required in 0.4
)
end

# TODO: figure out memory issue
# the memory comments are from 500 loop test with vecsize=100
function train!(m::Model, s::Adagrad; xmax=100, alpha=0.75)
J = 0.0
shuffle!(m.covec)

vecsize = size(m.W_main, 1)
eltype = typeof(m.b_main[1])
vm = zeros(eltype, vecsize)
vc = zeros(eltype, vecsize)
grad_main = zeros(eltype, vecsize)
grad_ctx = zeros(eltype, vecsize)

for n=1:s.niter
# shuffle indices
for i = 1:length(m.covec)
@inbounds l1 = m.covec[i].i # main index
@inbounds l2 = m.covec[i].j # context index
@inbounds v = m.covec[i].v

vm[:] = m.W_main[:, l1]
vc[:] = m.W_ctx[:, l2]

diff = dot(vec(vm), vec(vc)) + m.b_main[l1] + m.b_ctx[l2] - log(v)
fdiff = ifelse(v < xmax, (v / xmax) ^ alpha, 1.0) * diff
J += 0.5 * fdiff * diff

fdiff *= s.lrate
# inc memory by ~200 MB and running time by 2x
grad_main[:] = fdiff * m.W_ctx[:, l2]
grad_ctx[:] = fdiff * m.W_main[:, l1]

# Adaptive learning
# inc ~ 600MB + 0.75s
#= @inbounds for ii = 1:vecsize =#
#=     m.W_main[ii, l1] -= grad_main[ii] / sqrt(m.W_main_grad[ii, l1]) =#
#=     m.W_ctx[ii, l2] -= grad_ctx[ii] / sqrt(m.W_ctx_grad[ii, l2]) =#
#=     m.b_main[l1] -= fdiff / sqrt(m.b_main_grad[l1]) =#
#=     m.b_ctx[l2] -= fdiff / sqrt(m.b_ctx_grad[l2]) =#
#= end =#

m.W_main[:, l1] -= grad_main ./ sqrt(m.W_main_grad[:, l1])
m.W_ctx[:, l2] -= grad_ctx ./ sqrt(m.W_ctx_grad[:, l2])
m.b_main[l1] -= fdiff ./ sqrt(m.b_main_grad[l1])
m.b_ctx[l2] -= fdiff ./ sqrt(m.b_ctx_grad[l2])

# Gradients
fdiff *= fdiff
m.W_main_grad[:, l1] += grad_main .^ 2
m.W_ctx_grad[:, l2] += grad_ctx .^ 2
m.b_main_grad[l1] += fdiff
m.b_ctx_grad[l2] += fdiff
end

#= if n % 10 == 0 =#
#=     println("iteration $n, cost $J") =#
#= end =#
end
end


Here's the entire repo https://github.com/domluna/GloVe.jl. Might be 
helpful.

I tried doing some loops but it allocates more memory (oddly enough) and 
gets slower.

You'll notice the word vectors are indexed by column; I changed the 
representation to see if it would make a difference in the loop. It didn't 
seem to.

This was run on:

Julia Version 0.4.0-dev+4893
Commit eb5da26* (2015-05-19 11:51 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.4.0)
  CPU: Intel(R) Core(TM) i5-2557M CPU @ 1.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Here the model consists of 100x19 matrices and 100-element vectors: 19 words 
in the vocab, 100-element word vectors.

@time GloVe.train!(model, GloVe.Adagrad(500))
   1.990 seconds  (6383 k allocations: 1162 MB, 10.82% gc time)

0.3 is a bit slower due to worse gc, but memory use is the same.

Any help would be greatly appreciated!
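One likely culprit in the loop version is that `eltype = typeof(m.b_main[1])` is a runtime binding the compiler may not infer through; a function barrier plus a scalar loop avoids per-iteration temporaries. A hedged sketch (the `update_col!` helper name is made up for illustration, modern syntax):

```julia
# Devectorized Adagrad-style column update: no temporaries, unlike
# expressions such as W[:, col] -= grad ./ sqrt(G[:, col]).
function update_col!(W::Matrix{Float64}, G::Matrix{Float64},
                     grad::Vector{Float64}, col::Int)
    @inbounds for i in 1:size(W, 1)
        W[i, col] -= grad[i] / sqrt(G[i, col])  # scalar ops only
        G[i, col] += grad[i]^2                  # accumulate squared gradient
    end
    return W
end

W = ones(3, 2); G = ones(3, 2); g = [0.5, 0.5, 0.5]
update_col!(W, G, g, 1)
@assert W[1, 1] == 0.5 && G[1, 1] == 1.25
```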


[julia-users] Questions relating to packages and using/creating them

2014-11-10 Thread Dom Luna
I have some general questions about using packages.

1. Is there a way to create a workspace separate from $HOME/.julia? This 
would still have the same functionality when calling using in the REPL.
2. What's the best practice for packages with the same name? I don't have a 
problem related to this but I'm just curious how this is handled. I think 
via Pkg.add(...) there's only one definition of any package name, but with 
Pkg.clone(...) I could see package name collisions. Having all the packages 
under one directory doesn't seem scalable to me.
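For question 1, a sketch of the knobs involved: 0.3/0.4-era Julia used the JULIA_PKGDIR environment variable to relocate the package directory, and `LOAD_PATH` (which is real) controls where `using` searches for code; the directory below is made up.

```julia
# Make `using MyLocalPkg` find code outside ~/.julia by extending LOAD_PATH.
# The path here is hypothetical.
push!(LOAD_PATH, expanduser("~/code/julia-packages"))
println(last(LOAD_PATH))
```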

thanks


[julia-users] Gadfly plotting multiple lines with different colours

2014-10-21 Thread Dom Luna
So I have a vector x and a matrix M.

The vector x consists of the points on the x axis, and the columns of M are 
the respective y-coords for each line.

I'm currently plotting all the lines using layers:

layers = Layer[]
for i=1:10
   push!(layers, layer(x=x, y=M[:,i], Geom.line)[1])
end

plot(layers)

This gives me essentially what I want, except all the lines are the same 
colour. What's the best way to get Gadfly to uniquely colour each line?

Also if the above layering approach isn't the best way to do what I'm 
trying to do please let me know.

Thanks.
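Two common Gadfly approaches can be sketched as follows (hedged: written in modern Julia syntax, and the Gadfly/Colors API details should be checked against the docs for your version):

```julia
using Gadfly, Colors

x = collect(1:20)
M = cumsum(randn(20, 10), dims = 1)

# Option 1: give each layer its own Theme color.
colors = distinguishable_colors(10)
layers = [layer(x = x, y = M[:, i], Geom.line,
                Theme(default_color = colors[i]))[1] for i in 1:10]
plot(layers...)

# Option 2: reshape the data to long form and use the `color` aesthetic,
# which also produces a legend automatically, e.g.
#   plot(df, x = :x, y = :y, color = :series, Geom.line)
```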


Re: [julia-users] What's with the Nothing type?

2014-05-24 Thread Dom Luna
Thanks for all the helpful messages everyone, much appreciated:)


[julia-users] What's with the Nothing type?

2014-05-23 Thread Dom Luna
I just have general curiosity about the Nothing type, is there anything one 
should particularly know about it? Is it similar to a None type that one 
would typically find in pattern matching, ex. an Option type where it can 
be either Something or None, etc.

I feel like Nothing and patterns for its use aren't well documented to this 
point.

Dom
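For reference, the behaviour in question can be demonstrated directly (modern Julia spells the type `Nothing` and the value `nothing`; 0.3-era Julia used the same names, with `None` as the separate empty bottom type):

```julia
# A function with no explicit return value implicitly returns `nothing`.
function greet(io::IO, name)
    println(io, "hello, $name")
end

result = greet(stdout, "world")
@assert result === nothing   # implicit return is `nothing`
@assert result isa Nothing   # `nothing` is the only value of type Nothing

# The Option-like pattern the post asks about is spelled with `nothing`
# as a sentinel for "no value":
maybe_first(xs) = isempty(xs) ? nothing : xs[1]
@assert maybe_first([]) === nothing
@assert maybe_first([3, 4]) == 3
```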


Re: [julia-users] What's with the Nothing type?

2014-05-23 Thread Dom Luna
That cleared it up, thanks! Do functions that don't explicitly return 
anything then implicitly return Nothing?

Sorry I didn't catch the FAQ section, might it be better to have that as a 
short section in types?

Dom

On Friday, May 23, 2014 3:19:35 PM UTC-4, Stefan Karpinski wrote:

 This FAQ entry may answer the question:


 http://docs.julialang.org/en/latest/manual/faq/#nothingness-and-missing-values

 If not, maybe we can expand it or clarify whatever's not clear.


 On Fri, May 23, 2014 at 2:40 PM, Dom Luna dlun...@gmail.com wrote:

 I just have general curiosity about the Nothing type, is there anything 
 one should particularly know about it? Is it similar to a None type that 
 one would typically find in pattern matching, ex. an Option type where it 
 can be either Something or None, etc.

 I feel like Nothing and patterns for its use aren't well documented to 
 this point.

 Dom


  

Re: [julia-users] Re: Downloaded binary startup way slower than when compiled from github

2014-05-18 Thread Dom Luna
Just downloaded it today again to try it out and the binary has the same 
startup times as from source now. The version of darwin in the binary is 
still 12.5.0 vs 13.1.0 from source. I have no idea if that's an issue or 
not but the startup time is fine now, thanks Elliot.

Dom

On Saturday, May 17, 2014 5:09:42 PM UTC-4, Elliot Saba wrote:

 Yep, we used to do this on purpose, since we didn't have a good way of 
 restricting the optimizations used by the compiler.  Now we've got a good 
 baseline set, and the nightlies needed their configurations to be matched. 
  New binaries should be up by tonight.
 -E


 On Sat, May 17, 2014 at 11:34 AM, Tobias Knopp tobias...@googlemail.com wrote:

 It seems that the compiled system image is not included in the prerelease 
 binaries.

 Am Samstag, 17. Mai 2014 20:23:46 UTC+2 schrieb Dom Luna:

 I find it weird that the downloaded one has a drastically slower REPL 
 startup than when compiled from github repo.

 $ where julia
 /Applications/Julia-0.3.0-prerelease-0b05b21911.app/
 Contents/Resources/julia/bin/julia
 /usr/local/bin/julia

 I'm symlinking $HOME/julia/julia to /usr/local/bin/julia

 Here's the startup times

 Downloaded:

 time julia -e 'println("Helo")'
 5.07s user 0.10s system 98% cpu 5.250 total

 Source:

 time /usr/local/bin/julia -e 'println("Helo")'
 0.28s user 0.08s system 117% cpu 0.308 total

 The versions are 1 day old from each other.

 Downloaded:

                _
    _       _ _(_)_     |  A fresh approach to technical computing
   (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
    _ _   _| |_  __ _   |  Type "help()" to list help topics
   | | | | | | |/ _` |  |
   | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3053 (2014-05-14 22:03 UTC)
  _/ |\__'_|_|_|\__'_|  |  Commit 0b05b21* (2 days old master)
  |__/                  |  x86_64-apple-darwin12.5.0


 Source:

                _
    _       _ _(_)_     |  A fresh approach to technical computing
   (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
    _ _   _| |_  __ _   |  Type "help()" to list help topics
   | | | | | | |/ _` |  |
   | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3081 (2014-05-16 15:12 UTC)
  _/ |\__'_|_|_|\__'_|  |  Commit eb4bfcc (1 day old master)
  |__/                  |  x86_64-apple-darwin13.1.0

 The main thing I notice is the apple-darwin12.5.0 vs apple-darwin13.1.0. 
 I'm not sure what that means. I'm on OSX 10.9.2.

 Dom




[julia-users] Downloaded binary startup way slower than when compiled from github

2014-05-17 Thread Dom Luna
I find it weird that the downloaded one has a drastically slower REPL 
startup than when compiled from github repo.

$ where julia
/Applications/Julia-0.3.0-prerelease-0b05b21911.app/Contents/Resources/julia/bin/julia
/usr/local/bin/julia

I'm symlinking $HOME/julia/julia to /usr/local/bin/julia

Here's the startup times

Downloaded:

time julia -e 'println("Helo")'
5.07s user 0.10s system 98% cpu 5.250 total

Source:

time /usr/local/bin/julia -e 'println("Helo")'
0.28s user 0.08s system 117% cpu 0.308 total

The versions are 1 day old from each other.

Downloaded:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3053 (2014-05-14 22:03 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0b05b21* (2 days old master)
|__/                   |  x86_64-apple-darwin12.5.0


Source:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3081 (2014-05-16 15:12 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit eb4bfcc (1 day old master)
|__/                   |  x86_64-apple-darwin13.1.0

The main thing I notice is the apple-darwin12.5.0 vs apple-darwin13.1.0. 
I'm not sure what that means. I'm on OSX 10.9.2.

Dom



Re: [julia-users] Interfaces like in Go

2014-03-27 Thread Dom Luna
Yeah, abstract types seem to be the best place to implement something like 
this since, at least to my knowledge, it wouldn't fundamentally break 
anything. You would still be able to define abstract types as is, but you 
would also have the added power to further refine the behaviour of that 
type.

On Thursday, March 27, 2014 4:53:26 AM UTC-4, Tobias Knopp wrote:

 In my opinion it would be worth adding some syntax for defining an 
 interface for abstract types.
 It should give us nice error messages and a clean way to document an 
 interface.

 This is quite similar to the C++ concepts but as it is already possible to 
 restrict the template parameter in Julia, the only missing thing is to 
 define the interface for an abstract type. 



[julia-users] Interfaces like in Go

2014-03-26 Thread Dom Luna
I really like how interfaces work in Go and I was wondering if there's a 
similar way to express this in Julia. For those who are unfamiliar with Go 
interfaces they're essentially types defined by behaviour (functions) and 
not by fields.

So for example the ReadWriter interface requires Read and Write 
methods; any other type that has Read and Write methods can be 
considered a ReadWriter.

Dom