[julia-users] Re: [ANN] JuliaIO and FileIO
That's a very nice idea. Having a common way to load files with different backends is neat and very useful. Even the file_str macro feels like a very Julian way to do things, I believe. Maybe FastaIO could benefit from this model (and other parsers for biological data as well). We should contact the BioJulia and FastaIO people to see what can be done.

On Saturday, April 4, 2015 at 12:41:14 UTC-3, Simon Danisch wrote:

Hi there,

FileIO aims to make it very easy to read any arbitrary file. I hastily copied together a proof of concept by taking code from Images.jl.

JuliaIO is the umbrella group, which takes in IO packages with no home. If someone wrote an IO package but doesn't have time to implement the FileIO interface, giving it to JuliaIO might be a good idea in order to keep the package usable.

The concept of FileIO is described in the readme:

Meta package for FileIO. The purpose is to open a file and return the respective Julia object, without doing any research on how to open the file.

    f = file"test.jpg"      # -> File{:jpg}
    read(f)                 # -> Image
    read(file"test.obj")    # -> Mesh
    read(file"test.csv")    # -> DataFrame

So far only Images are supported, and MeshIO is on the horizon.

It is structured the following way. There are three levels of abstraction: first FileIO, which defines the file_str macro etc.; then a meta package for a certain class of file, e.g. Images or Meshes. This meta package defines the Julia datatype (e.g. Mesh, Image) and organizes the importer libraries. It is also a good place to define tests for different file formats that are independent of the IO library. On the last level are the low-level importer libraries, which do the actual IO. They're included via Mike Innes' Requires package (https://github.com/one-more-minute/Requires.jl), so that they don't introduce extra load time if not needed. This way, using FileIO without reading or writing anything should have short load times.

As an implementation example, please look at FileIO - ImageIO - ImageMagick.

This should already work as a proof of concept. Try:

    using FileIO                  # should be very fast, thanks to Mike Innes' Requires package
    read(file"test.jpg")          # takes a little longer as it needs to load the IO library
    read(file"test.jpg")          # should be fast
    read(File("documents", "images", "myimage.jpg"))  # automatic joinpath via File constructor

Please open issues if things are not clear or if you find flaws in the concept or implementation. If you're interested in working on this infrastructure, I'll be pleased to add you to the JuliaIO group.

Best,
Simon
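The dispatch idea behind `File{:jpg}` can be illustrated with a small sketch. This is not the FileIO source: `DemoFile` and `demo_read` are made-up names, the returned strings stand in for real Image, Mesh and DataFrame objects, and the code assumes a Julia 1.x syntax rather than the 2015 one.

```julia
# Hypothetical sketch, not the FileIO implementation: all names are invented.

# A file path tagged with its format as a type parameter, e.g. DemoFile{:jpg}.
struct DemoFile{Ext}
    path::String
end

# Build the tagged type from the extension: "test.jpg" becomes DemoFile{:jpg}.
# Several path components are joined first, like the File constructor described above.
function DemoFile(parts::AbstractString...)
    path = joinpath(parts...)
    ext = lowercase(lstrip(splitext(path)[2], '.'))
    return DemoFile{Symbol(ext)}(path)
end

# One method per format; a real backend would parse the file here.
demo_read(f::DemoFile{:jpg}) = "image from $(f.path)"
demo_read(f::DemoFile{:obj}) = "mesh from $(f.path)"
demo_read(f::DemoFile{:csv}) = "table from $(f.path)"

# Plain strings are wrapped first, so demo_read("test.jpg") also works.
demo_read(path::AbstractString) = demo_read(DemoFile(path))
```

Because the format lives in the type, adding a new format is just one more `demo_read` method, and it could live in a separate package that is only loaded when needed.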
[julia-users] Re: Introducing Julia wikibook
It's not bad, but it could be much better. We currently need contributors, since some of the earlier ones left some time ago to work on other manuals and books. If anyone here could help, we would be glad.

On Wednesday, February 11, 2015 at 18:18:39 UTC-2, David P. Sanders wrote:

Just stumbled on this, which seems not bad (though I haven't looked in detail): http://en.wikibooks.org/wiki/Introducing_Julia
[julia-users] Help: K-Means Clustering Algorithm
Hi guys,

I'm trying to implement the K-Means Clustering Algorithm, but I'm having some problems. The function I wrote:

    function kcluster(data; distance = pearson, k=4)
        # Generate a list of tuples of the min and max values of each column of data
        ranges = [(minimum(data[:,i]), maximum(data[:,i])) for i in 1:size(data, 2)]

        # Create k randomly placed centroids
        centroids = [rand()*ranges[j][2] - ranges[j][1] + ranges[j][1] for i in 1:k, j in 1:length(ranges)]

        lastmatches = Any[]
        for t in 1:100
            println("Iteration $t")
            bestmatches = [Int[] for i in 1:k]

            # Get best matches for each cluster
            for j in 1:size(data, 1)
                row = data[j, :]
                bestmatch = 1
                bestd = distance(centroids[bestmatch, :], row)
                for i in 1:k
                    d = distance(centroids[i, :], row)
                    if d < bestd
                        bestd = d
                        bestmatch = i
                    end
                end
                push!(bestmatches[bestmatch], j)
            end

            if lastmatches == bestmatches
                return lastmatches
            end
            lastmatches = bestmatches

            # Move clusters to the average of its matches
            numcols = size(data, 2)
            for i in 1:k
                avgs = zeros(1, numcols)
                if length(bestmatches[i]) > 0
                    for row in bestmatches[i]
                        avgs += data[row, :]
                    end
                    avgs /= length(bestmatches[i])
                    centroids[i, :] = avgs
                end
            end
        end
        return lastmatches
    end

The data argument is a two-dimensional Array, with each row representing an individual and each column a coordinate of its position in space.

The problem is the following: the same algorithm in Python (with the same input data) usually stops near iteration #5, while in Julia it always runs to iteration #100. The non-empty clusters in Python are also smaller, so there are fewer empty clusters. Can somebody find out why it never enters the if lastmatches == bestmatches block?

Sorry about my poor English.
Re: [julia-users] Help: K-Means Clustering Algorithm
Hi John,

Thanks for the tip, but actually I'm not using this function in production. I was reading Programming Collective Intelligence and trying to implement the examples in Julia rather than Python (with some complications, such as missing packages like Beautiful Soup, but that's OK...). So this is an exercise to help me understand these algorithms and learn more Julia at the same time. The next book I'll try this with is, guess what, Machine Learning for Hackers! I hope that translating the algorithms in that book is easier.

Thanks again!

On Thursday, July 3, 2014 at 19:29:37 UTC-3, John Myles White wrote:

Hi Paulo,

Rather than implement k-means from scratch, I'd encourage you to use the implementation in the Clustering.jl package.

-- John

On Jul 3, 2014, at 2:51 PM, Paulo Castro p.olivei...@gmail.com wrote:

Hi guys, I'm trying to implement the K-Means Clustering Algorithm, but I'm having some problems. [...]
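One line of the posted function that may be worth checking is the centroid initialization. As written, `rand()*ranges[j][2] - ranges[j][1] + ranges[j][1]` subtracts and then adds the minimum, so it reduces to `rand()*max`. If the intent was a point drawn uniformly between the column minimum and maximum, the difference needs parentheses. The thread does not confirm that this explains the extra iterations; the sketch below only shows the arithmetic, with made-up bounds and a hypothetical helper `random_centroids`, assuming Julia 1.x.

```julia
# Made-up bounds; `lo` and `hi` stand in for ranges[j][1] and ranges[j][2].
lo, hi = 10.0, 12.0
r = 0.5                          # fixed stand-in for rand(), so the comparison is reproducible

as_posted = r * hi - lo + lo     # the -lo and +lo cancel: this is just r * hi
intended  = r * (hi - lo) + lo   # maps r in [0, 1) onto [lo, hi)

# Hypothetical helper: k random centroids, one row each, inside the data's bounding box.
function random_centroids(data::AbstractMatrix, k::Integer)
    lo = minimum(data, dims=1)   # 1 x ncols row of column minima
    hi = maximum(data, dims=1)   # 1 x ncols row of column maxima
    return lo .+ rand(k, size(data, 2)) .* (hi .- lo)
end
```

With these bounds the posted expression lands outside the range of the data, which would leave such a centroid far from every row unless the column minimum happens to be zero.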
[julia-users] How can I create a simple Graph using Graphs.jl?
Hi guys,

Sorry for asking this kind of question, but even after reading the documentation I don't know how to create the simplest graph object using Graphs.jl. For example, I want to create the following graph: http://i.stack.imgur.com/820Fl.png

Can someone point me in the right direction to get started?
[julia-users] DataFrames: Problems with Split-Apply-Combine strategy
*I asked this question on StackOverflow, but I think I will get better results posting it here. We should use that platform more, so Julia gets more exposure to R/Python/Matlab users who need something like it.*

I have some data (from an R course assignment, but that doesn't matter) on which I want to use the split-apply-combine strategy, but I'm having some problems. The data is in a DataFrame called outcome, and each row represents a hospital. Each column holds information about that hospital, such as name, location, rates, etc.

*My objective is to obtain the hospital with the lowest Mortality by Heart Attack Rate in each State.*

I was playing around with some strategies and ran into a problem using the by function:

    best_heart_rate(df) = sort(df, cols = :Mortality)[end,:]
    best_hospitals = by(hospitals, :State, best_heart_rate)

The idea was to split the hospitals DataFrame by State, sort each of the SubDataFrames by Mortality Rate, take the lowest one, and combine the rows into a new DataFrame.

But when I used this strategy, I got:

    ERROR: no method nrow(SubDataFrame{Array{Int64,1}})
     in sort at /home/paulo/.julia/v0.3/DataFrames/src/dataframe/sort.jl:311
     in sort at /home/paulo/.julia/v0.3/DataFrames/src/dataframe/sort.jl:296
     in f at none:1
     in based_on at /home/paulo/.julia/v0.3/DataFrames/src/groupeddataframe/grouping.jl:144
     in by at /home/paulo/.julia/v0.3/DataFrames/src/groupeddataframe/grouping.jl:202

I suppose the nrow function is not implemented for SubDataFrames for a good reason, so I gave up on this strategy. Then I used some nastier code:

    best_heart_rate(df) = (df[sortperm(df[:,:Mortality] , rev=true), :])[1,:]
    best_hospitals = by(hospitals, :State, best_heart_rate)

It seems to work. But now there is an NA problem: how can I remove the rows from the SubDataFrames that have NA in the Mortality column? Is there a better strategy to accomplish my objective?
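The split-apply-combine goal described above (one row per State, lowest Mortality, unknown rates skipped) can be sketched in plain Julia without any package. This is not the DataFrames API: the rows are made up, `best_by_state` is a hypothetical helper, and `missing` is used on the assumption that it plays the role NA had at the time. It assumes Julia 1.x.

```julia
# Made-up rows standing in for the hospitals table; `missing` marks an unknown rate.
hospitals = [
    (name = "A", state = "TX", mortality = 14.2),
    (name = "B", state = "TX", mortality = 12.1),
    (name = "C", state = "NY", mortality = missing),
    (name = "D", state = "NY", mortality = 13.5),
]

# Hypothetical helper: keep, for each state, the row with the smallest mortality.
function best_by_state(rows)
    best = Dict{String,Any}()
    for row in rows
        ismissing(row.mortality) && continue      # drop rows with no rate
        current = get(best, row.state, nothing)
        if current === nothing || row.mortality < current.mortality
            best[row.state] = row                 # first row seen, or a lower rate
        end
    end
    return best
end

best = best_by_state(hospitals)
```

Filtering out the unknown values before comparing avoids having to decide how they sort, which is the part the NA question is about.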
[julia-users] Re: JuliaCon Question Thread
Putting a link to the slides in each video's description would also be useful. Thanks!
[julia-users] Gadfly: plotting histogram of a integer variable x
Hi guys,

I'm having a problem when plotting data like this:

    data = int(round(dropna(outcome[:,11])))
    p = plot(x=data, Geom.histogram)

Here, data is an Array{Int64,1}. But the plot I get after running this has its bars spread apart, with gaps between the integers. Is this expected? How can I make the same plot but widen the bars so they touch each other?

https://lh6.googleusercontent.com/-5_gJWTBY4vY/U30w2QEPWyI/AGU/mfgnxTzo8uE/s1600/myplot.png
[julia-users] Problems with NA on Array{Any} using square brackets notation
Hi there,

I'm using the DataFrames package and having problems using NA in an Array{Any}. Here is the thing:

    julia> a = [true 2 "hi"]
    1x3 Array{Any,2}:
     true  2  "hi"

That's expected: Julia had no way to promote these elements to a single concrete type, so I got an Array{Any}. Now, if I try:

    julia> a[3] = NA
    1x3 Array{Any,2}:
     true  2  NA

That's also expected! NA is of type NAtype, which is a subtype of Any (like any other type). But if I try:

    julia> a = [true 2 NA]
    ERROR: no method convert(Type{Int64}, NAtype)
     in setindex! at multidimensional.jl:63
     in cat at abstractarray.jl:625
     in hcat at abstractarray.jl:632

That's unexpected. What am I missing?

Thanks, and sorry about my poor English.
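The error message mentions converting to Int64, which suggests that the untyped literal first picks a common element type by promotion and only then stores the elements. A typed literal such as `Any[...]` skips that promotion step. The sketch below shows the two behaviours in plain Julia 1.x; `missing` is used as a stand-in for NA, which is an assumption, since NA belonged to the DataFrames of that period.

```julia
# An untyped literal promotes its elements to a common type when it can:
b = [true 2]               # Bool and Int promote to Int, so `true` is stored as 1

# A typed literal fixes the element type up front and keeps every element as given.
# `missing` stands in for NA here (an assumption, see the note above).
a = Any[true 2 missing]
```

Writing the element type in front of the brackets is a general way to opt out of promotion whenever the elements are meant to stay heterogeneous.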
[julia-users] Re: JuliaCon: Registration is open!
Will the event organizers post videos on YouTube for Julia users who couldn't attend?

On Tuesday, May 6, 2014 at 16:51:03 UTC-3, Evan Miller wrote:

On behalf of the JuliaCon committee[1], I'm pleased to announce that registration is now open for JuliaCon, taking place June 26 and 27 at the University of Chicago's Gleacher Center in Chicago, IL. JuliaCon will be a two-day, single-track conference and an excuse for the Julia faithful to get together and geek out. Tickets are $110.

Conference website: http://www.juliacon.org/
Registration page: http://juliacon.eventbrite.com/

We have some great speakers lined up already, and are looking for more. See the Call For Participation on the website for details. Talk proposals are due Monday, May 26.

The day after the conference, Hack@UChicago will host a free Julia Hack Day at the University of Chicago's Hyde Park campus. You'll need to register for this event separately (link on website).

Sponsorship opportunities are available, which will help us keep the ticket prices nice and cheap. If your organization might be interested in being a JuliaCon sponsor, please contact me directly.

Mark your calendar, and get pumped! This will be a great event. More information will be posted on the website as the conference approaches.

See you in June!

Evan

1. Full committee:
   Douglas Bates, Wisconsin-Madison
   Jeff Bezanson, MIT
   Jonah Bloch-Johnson, UChicago
   Jiahao Chen, MIT
   Alan Edelman, MIT
   Garrett Smith, CloudBees
   Jeff Hammond, Argonne
   Tim Holy, WUSTL
   Stefan Karpinski, MIT
   Evan Miller, Wizard
   Hunter Owens, UChicago
   James Porter, UChicago
   Arch Robison, Intel
   Viral B Shah, MIT
[julia-users] Re: Mysterious setindex! error when running Runge-Kutta-Fehlberg for two coupled functions
Thanks, it worked!

On Sunday, May 4, 2014 at 19:06:47 UTC-3, Tony Kelman wrote:

On the lines where you initialize k and l to [0 0 0 0 0 0], that is an array of integers by default. Then in the for loops where you try to assign a floating-point result into that array, you get an InexactError because a general floating-point number can't be exactly represented as an integer. If you change those to float64([0 0 0 0 0 0]) or Float64[0 0 0 0 0 0] or [0.0 0.0 0.0 0.0 0.0 0.0], and likewise with t0, v0, and w0, it should work.

On Sunday, May 4, 2014 2:40:17 PM UTC-7, Paulo Castro wrote:

I was trying to port a function from a previous Python code of mine, and ended up with this:

    # Runge-Kutta-Fehlberg
    function RKF45(f, g, h, t0, tf, v0, w0)
        # Initial values
        t = t0
        v5 = v4 = v = v0
        w = w0
        tolerance = 10^(-5.0)

        # Create list for future plots
        t_list = [t]
        v_list = [v]
        w_list = [w]

        # Values for a, b and c (as on Butcher's tableau)
        a = [0          0           0           0          0       0;
             1/4        0           0           0          0       0;
             3/32       9/32        0           0          0       0;
             1932/2197  -7200/2197  7296/2197   0          0       0;
             439/216    -8          3680/513    -845/4104  0       0;
             -8/27      2           -3544/2565  1859/4104  -11/40  0]

        b = [16/135  0  6656/12825  28561/56430  -9/50  2/55;
             25/216  0  1408/2565   2197/4104    -1/5   0]

        c = [0 1/4 3/8 12/13 1 1/2]

        # Relative to y
        k = [0 0 0 0 0 0]
        # Relative to z
        l = [0 0 0 0 0 0]

        # Compute the next terms
        for i in t0:h:tf
            # Compute the next values of K and L
            for j in 1:6
                k[j] = f(t + c[j] * h, v + (a[j,:] * k')[1], w + (a[j,:] * l' )[1])*h
                l[j] = g(t + c[j] * h, v + (a[j,:] * k')[1], w + (a[j,:] * l' )[1])*h
            end

            # Compute the next value of V
            #
            # Here we implemented a tolerance test
            # v4 = v + (b[2] * k')[1]
            # v5 = v + (b[1] * k')[1]
            #
            # error = abs(v5 - v4)
            # if error > tolerance
            #
            #     h = 0.9 * h * ((tolerance/error) ^ (0.25))
            #
            #     for j in 1:6
            #         k[j] = f(t + c[j] * h, v + (a[j] * k')[1], w + (a[j] * l' )[1])*h
            #         l[j] = g(t + c[j] * h, v + (a[j] * k')[1], w + (a[j] * l' )[1])*h
            #     end
            #
            #     v5 = v + (b[1,:] * k')[1]
            # end
            #
            # v = v5
            #

            # Compute T and W with the right values of H and L, obtained after the tolerance test
            t += h
            w += (b[1,:] * l')[1]

            # Append new values to the lists
            push!(t_list,t)
            push!(v_list,v)
            push!(w_list,w)
        end

        return t_list, v_list, w_list
    end

After running this code, I got a mysterious error message:

    ERROR: InexactError()
     in setindex! at array.jl:346
    while loading /home/paulo/Documents/Working/ex1.jl, in expression starting on line 28

ex1.jl is the file I used to test my function:

    include("rungekutta.jl")

    function f(t, v, w)
        return (v*(v-a)*(1-v) - w + I)/ε
    end

    function g(t, v, w)
        return (v - p*w - b)
    end

    ε = 0.005
    a = 0.5
    b = 0.15
    p = 1.0
    I = 0.0
    h = 0.005

    # Initial and final values of time
    t0 = 0
    tf = 8

    t, v, w = RKF45(f, g, h, t0, tf, 0, 0) # v0 = w0 = 0

Can someone help me find the problem? I tried a lot of things, but always end up with this error.

Thanks!
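The cause described in the reply can be reproduced in a few lines. The sketch below assumes Julia 1.x, where the array type prints as `Matrix{Int64}` and `zeros(1, 6)` is a convenient way to get a floating-point row; the variable names mirror the posted function.

```julia
# A literal made of integers gives an integer array.
k = [0 0 0 0 0 0]

# Storing a non-integer value in it fails, because 0.5 has no exact Int representation.
err = try
    k[1] = 0.5
    nothing
catch e
    e
end

# A floating-point array of the same shape accepts the value.
kf = zeros(1, 6)
kf[1] = 0.5

# The typed literal from the reply also yields Float64 elements.
kt = Float64[0 0 0 0 0 0]
```

The same reasoning applies to scalars that are later updated in place of a list element: starting from `0.0` instead of `0` keeps every stored value a Float64.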
[julia-users] Array of images
Hi,

I am just starting to use Julia, and I'm having a simple problem. I have some images in a directory, and I want to iterate over them, open each one with Images' imread(), and store it in an array. I could not manage to create an empty array and append images to it. How can I achieve this?

Thanks,
Paulo
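Collecting items into an array one by one usually comes down to starting from an empty vector with a declared element type and calling `push!`. The sketch below assumes Julia 1.x and uses random matrices in place of real images: `fake_imread` is a made-up stand-in for imread, and the file names are invented (in practice they could come from `readdir`).

```julia
# Stand-in for imread: returns a small random matrix instead of reading a file (assumption).
fake_imread(filename) = rand(4, 4)

filenames = ["a.png", "b.png", "c.png"]   # invented names; readdir(dir) would list real ones

# Option 1: start from an empty, typed vector and push! each image onto it.
images = Matrix{Float64}[]
for filename in filenames
    push!(images, fake_imread(filename))
end

# Option 2: a comprehension builds the same kind of vector in one step.
images2 = [fake_imread(filename) for filename in filenames]
```

With real images the element type would be whatever imread returns; `Any[]` also works as the starting point if that type is not known in advance.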
[julia-users] How do I optimize a multi-argument function with Optim.jl?
Hi!

I'm doing the Machine Learning course exercises with Julia. I know how to use Optim.jl when the cost function has only one argument, for example:

    f(t) = someFunctionOfTheta(t)
    optimize(f, initial_theta)

But one of the exercises is to run Octave's fminunc this way:

    options = optimset('GradObj', 'on', 'MaxIter', 400);
    [theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

This piece of code computes t so that costFunction(t, X, y) is at its minimum. How do I do the same thing in Julia?

Thanks!
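The Octave expression `@(t)(costFunction(t, X, y))` is an anonymous function that fixes X and y, and Julia has the same construct. The sketch below assumes Julia 1.x; `cost_function` is a toy least-squares cost invented for illustration, not the course's function, and the data are made up. The Optim.jl call is left as a comment so that the block runs without the package.

```julia
# Toy least-squares cost; name and signature mirror the Octave call but are assumptions.
cost_function(t, X, y) = sum(abs2, X * t .- y) / (2 * length(y))

# Made-up data: y is exactly the second column of X.
X = [1.0 1.0; 1.0 2.0; 1.0 3.0]
y = [1.0, 2.0, 3.0]
initial_theta = zeros(2)

# Close over X and y so the optimizer only sees a function of t.
f = t -> cost_function(t, X, y)

# With Optim.jl loaded, this one-argument f is what would be handed over:
# optimize(f, initial_theta)
```

The closure captures X and y from the enclosing scope, so nothing about the optimizer's interface has to change when the cost function needs extra data.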
[julia-users] Different error messages for sqrt(-1) and sqrt(-1.0)
Have you already noticed that the error messages (on julia 0.3.0-prerelease+1419) for sqrt(-1) and sqrt(-1.0) are different? Here:

    julia> sqrt(-1)
    ERROR: DomainError

    julia> sqrt(-1.0)
    ERROR: DomainError
    sqrt will only return a complex result if called with a complex argument.
    try sqrt(complex(x))
     in sqrt at math.jl:277

Is it a bug?
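Both calls fail with the same exception type and differ only in the accompanying text, and the longer message already names the workaround. The sketch below checks this on a recent Julia 1.x, which is an assumption since the post concerns a 0.3 prerelease; `error_type` is a hypothetical helper written for this example.

```julia
# Hypothetical helper: the type of exception thrown by f(x), or `nothing` if none is thrown.
function error_type(f, x)
    try
        f(x)
        return nothing
    catch e
        return typeof(e)
    end
end

int_err   = error_type(sqrt, -1)       # integer argument
float_err = error_type(sqrt, -1.0)     # floating-point argument

# Converting to a complex number first, as the longer message suggests, gives a result.
z = sqrt(complex(-1.0))
```

Code that catches the exception therefore behaves the same for both inputs; only the text shown to the user differed in the session above.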