[julia-users] MongoDB and Julia

2015-07-12 Thread Kevin Liu
Hi, 

I have Julia 0.3, Mongodb-osx-x86_64-3.0.4, and Mongo-c-driver-1.1.9 
installed, but can't get Julia to access the Mongo client through this 
package, https://github.com/pzion/Mongo.jl, which is listed as 'untestable' 
on http://pkg.julialang.org/. 

I have also tried Lytol/Mongo.jl, but the command require("Mongo.jl") can't 
open the file Mongo.jl, or the auto-generated deps.jl. 

Is anyone having similar problems trying to make Julia work with Mongo? 

Thank you

Kevin


[julia-users] 4th Julia meetup in Japan: JuliaTokyo #4.

2015-07-12 Thread theremins
Hi,


On July 11th we had our 4th Julia meetup in Japan, "JuliaTokyo #4". This 
time we had 30+ participants.


---


JuliaTokyo #4 Presentation List in English

# Hands-on Session
by Michiaki Ariga
https://github.com/chezou/JuliaTokyoTutorial
(We tried to use JuliaBox, but failed with "Maximum number of JuliaBox 
instances active. Please try after sometime." ...)

# Main Talks
1. JuliaCon2015 Report - Sorami Hisamoto
2. Julia Summer of Code: An Interim Report - Kenta Sato
3. High-performance Streaming Analytics using Julia - Andre Pemmelaar
4. Why don't you create Spark.jl? - @sfchaos
5. Introducing QuantEcon.jl - Daisuke Oyama

# Lightning Talks
1. Material for Julia Introduction Materials - @yomichi_137
2. Characteristic Color Extraction from Images - @mrkn
3. Julia and I, sometimes Mocha - @vanquish
4. It's Time for 3D Printing with Julia - uk24s
5. Mecha-Joshi Shogi (AI Japanese Chess) - @kimrin
6. Gitter and Slack - Michiaki Ariga


---


We also ran a survey on which languages and software people use on a daily 
basis. 56 people responded (multiple choices allowed):

language, #people
Python, 37
R, 21
C / Julia, 14
Java, 13
C++ / Ruby, 12
Excel, 7
Perl, 5
SAS / Scala, 4
Go / JavaScript, 3
Matlab / Visual Basic / Haskell / PHP / Objective C / D, 2
Clojure / F# / C# / .Net / SQL / Apex / ECMAScript / Elixir / Swift / 
Erlang / CUDA, 1

- sorami


[julia-users] Re: Too many packages?

2015-07-12 Thread Tony Kelman
As John and Matt said, a huge portion of the standard library is written in 
Julia itself, so there's no real technical need for it to be developed 
within the same repository. In fact, developing features as packages rather 
than as part of Base gets them to users in a stable way, on a time scale the 
package developer controls, rather than being subject to the readiness of 
features in the core language and compiler that obviously have to be done in 
the base repository.

> I can also find some
> Fortran/C code, and include in Julia, and have all these 
> functionality, but then what is the advantage of using Julia, as 
> opposed to, say, python?

Using Julia, your wrapper "glue code" will be shorter, simpler to 
understand, more efficient, entirely in the high-level language, and yet 
map more directly to the underlying library's interface. You just have to 
be able to point to a standard C or Fortran shared library and you can use 
that API directly, even interactively from the REPL. No need to invoke a 
separate compiler or build system at install time just for the 
language-specific binding code. Depending on the library, there will 
usually be less marshalling and worrying about cross-language data 
structure copying.
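
To make that concrete, here is a minimal sketch (the library and function are 
chosen purely for illustration) of calling into a C shared library directly 
from the REPL:

# call C's cbrt from libm via ccall: the tuple names the symbol and the shared
# library, followed by the return type, the argument types, and the arguments
x = ccall((:cbrt, "libm"), Float64, (Float64,), 27.0)
# x == 3.0, with no wrapper compilation step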

> But since Julia is a language specifically for scientific computation

That's not quite fair. Julia is a general-purpose language that happens to 
be designed to be very good at scientific computing tasks. (And most, but 
not all, of the early-adopter user and package developer community have 
come from that domain.) There are pieces included by default in the 
standard library right now that you would normally find in packages in the 
likes of NumPy or SciPy in other languages, but this may not be the case 
forever as the technical problems that have stood in the way of decoupling 
that development are gradually being solved.



[julia-users] Re: Too many packages?

2015-07-12 Thread yuuki
I think there's a big difference between developing core features in 
packages that ship with the default distribution, and having optional 
third-party packages implement core features.

Personally I also find the huge number of packages slightly annoying, but 
it's also clear the Julia team cannot do everything the way MATLAB does. The 
only thing I would really like to have included by default is plotting 
tools, because they are so essential for a lot of things. 


[julia-users] Re: Embedding Julia with C++

2015-07-12 Thread Kostas Tavlaridis-Gyparakis
Hello again,
I am writing to this thread as I face some more problems trying to link 
Julia functions with C++ code.
The thing is that I changed my Julia version: I built Julia from source 
because I wanted to install the Cxx package as well.
So I just went to run the smallest example of C and C++ code that uses a 
Julia function, as presented here, and I only try to run the following file:

#include <julia.h>
int main(int argc, char *argv[]){
/* required: setup the julia context */
jl_init(NULL);

/* run julia commands */
jl_eval_string("print(sqrt(2.0))");

/* strongly recommended: notify julia that the program is about to 
terminate. this allows julia time to cleanup pending write requests 
and run all finalizers*/
jl_atexit_hook();
return 0;}



The first thing that was different was that when I included just the paths 
of julia.h and libjulia.so and ran the command like this:

gcc -o test main.c -I /home/kostav/julia/src -L/home/kostav/julia/usr/lib 
-ljulia

I receive an error saying: 

In file included from main.c:1:0:
/home/kostav/julia/src/julia.h:12:24: fatal error: libsupport.h: No such file or directory
 #include "libsupport.h"
compilation terminated.

Then, when I add the path of libsupport.h, I receive a similar error message 
saying that uv.h is missing, so I have to add its path as well. As a result I 
finally run the following command:

gcc -o test main.c -I /home/kostav/julia/src -I 
/home/kostav/julia/usr/include -I /home/kostav/julia/src/support 
-L/home/kostav/julia/usr/lib -ljulia

In that case I receive various errors about undefined references to LLVM 
functions, so I also link against the corresponding library and finally run 
the following command:

gcc -o test main.c -I /home/kostav/julia/src -I 
/home/kostav/julia/usr/include -I /home/kostav/julia/src/support 
-L/home/kostav/julia/usr/lib -ljulia -lLLVM-3.7svn

In that case the program does compile, but when I try to run it I receive 
the following error:

error while loading shared libraries: libjulia.so: cannot open shared object 
file: No such file or directory

The exact same thing happens for the equivalent program in C++. Note that 
when I try to call a Julia function inside a more complicated C++ structure 
(namely a class) I receive a different error. More specifically, just for 
the very simple class from this Cxx example, using the same simple Julia 
function inside the class leads to the following slightly different cpp file:

#include "ArrayMaker.h"
#include 

using namespace std;

ArrayMaker::ArrayMaker(int iNum, float fNum) {

jl_init(NULL);
jl_eval_string("print(sqrt(2.0))");

cout << "Got arguments: " << iNum << ", and " << fNum << endl;
iNumber = iNum;
fNumber = fNum;
fArr = new float[iNumber];
jl_atexit_hook();
}

float* ArrayMaker::fillArr() {
cout << "Filling the array" << endl;
for (int i=0; i < iNumber; i++) {
fArr[i] = fNumber;
fNumber *= 2;
} 
return fArr;
}

With the header file being almost the same:

#ifndef ARRAYMAKER_H
#define ARRAYMAKER_H

#include 

class ArrayMaker
{
private:
int iNumber;
float fNumber;
float* fArr;
public:
ArrayMaker(int, float);
float* fillArr();
};

#endif

When I try to compile in terminal using the following command:

g++ -o test ArrayMaker.cpp -I /home/kostav/julia/src -I 
/home/kostav/julia/usr/include -I /home/kostav/julia/src/support 
-L/home/kostav/julia/usr/lib -ljulia -lLLVM-3.7svn

I do receive the following error: 

/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o: In 
function `_start': 
/build/buildd/glibc-2.21/csu/../sysdeps/x86_64/start.S:114: undefined 
reference to `main'
collect2: error: ld returned 1 exit status



I am really not sure how to resolve either of the two errors, and the second 
one is particularly important to me. My end goal is to start from a Julia file 
that uses my self-written C++ code, where inside the C++ code I call some other 
self-written Julia functions. I know it sounds kind of twisted, but that's what 
I am trying to do: I want to tackle a big mathematical problem by building the 
original problem in JuMP, then decomposing it and using a solver written in C++ 
(that's why I need to call C++ inside Julia), which then needs to send 
information back to the original model and probably solve some subproblems with 
CPLEX (that's why I will need to call Julia functions again inside the C++ 
code). That is it in as few words as possible, in order not to bother you with 
a lot of unnecessary information.


[julia-users] Re: 4th Julia meetup in Japan: JuliaTokyo #4.

2015-07-12 Thread Viral Shah
Please email julia...@googlegroups.com if you see such a timeout. Often it 
just means that a new machine is booting up, and things should work in a 
few minutes.

Sounds like a really fun meetup. BTW, are any of these slides in English - 
and if so, are they available anywhere?

-viral

On Sunday, July 12, 2015 at 1:16:45 PM UTC+5:30, therem...@gmail.com wrote:
>
> Hi,
>
>
> On July 11th we had our 4th Julia meetup in Japan, "JuliaTokyo #4". This 
> time we had 30+ perticipants.
>
>
> ---
>
>
> JuliaTokyo #4 Presentation List in English
>
> # Hands-on Session
> by Michiaki Ariga
> https://github.com/chezou/JuliaTokyoTutorial
> (We tired to use JuliaBox, but failed with "Maximum number of JuliaBox 
> instances active. Please try after sometime." ...)
>
> # Main Talks
> 1. JuliaCon2015 Report - Sorami Hisamoto
> 2. Julia Summer of Code: An Interim Report - Kenta Sato
> 3. High-performance Streaming Analytics using Julia - Andre Pemmelaar
> 4. Why don't you create Spark.jl? - @sfchaos
> 5. Introducing QuantEcon.jl - Daisuke Oyama
>
> # Lightning Talks
> 1. Material for Julia Introduction Materials - @yomichi_137
> 2. Characteristic Color Extraction from Images - @mrkn
> 3. Julia and I, sometimes Mocha - @vanquish
> 4. It's Time for 3D Priting with Julia - uk24s
> 5. Mecha-Joshi Shogi (AI Japanese Chess) - @kimrin
> 6. Gitter and Slack - Michiaki Ariga
>
>
> ---
>
>
> We also had a survey on what kind of languages and softwares people use on 
> a daily basis. 56 people (multiple choices allowed);
>
> language, #people
> Python, 37
> R, 21
> C / Julia, 14
> Java, 13
> C++ / Ruby, 12
> Excel, 7
> Perl, 5
> SAS / Scala, 4
> Go / JavaScript, 3
> Matlab / Visual Basic / Haskell / PHP / Objective C / D, 2
> Clojure / F# / C# / .Net / SQL / Apex / ECMAScript / Elixir / Swift / 
> Erlang / CUDA, 1
>
> - sorami
>


[julia-users] Re: Too many packages?

2015-07-12 Thread Viral Shah
It is worth differentiating between what core Julia - the language and its 
standard library - includes by default, and what different distributions of 
Julia may include to provide a good user experience. I personally have been 
wanting for a while to make a distribution that includes a few key packages 
that I like, plus plotting! I think with 0.4, we will be able to start doing 
this.

-viral

On Sunday, July 12, 2015 at 7:43:28 PM UTC+5:30, yu...@altern.org wrote:
>
> I think there's a big differences between developing core features in 
> packages and shipping them with the default version and having optional 
> third party packages implementing core features.
>
> Personally I also find the huge amount of packages to be slightly 
> annoying, but it's also clear the Julia team cannot do everything like 
> Matlab would. The only thing I really would like to have 
> included by default it plotting tools, because they are so essential for a 
> lot of things. 
>


Re: [julia-users] Re: Too many packages?

2015-07-12 Thread Tim Holy
As someone who remembers the days of the first packages, I also want to throw 
out the fact that the title of this issue should probably be viewed as a 
triumph :-).

--Tim

On Sunday, July 12, 2015 10:59:10 AM Viral Shah wrote:
> It is worth differentiating what core Julia - the language and its standard
> library - includes as a default, and what different distributions of Julia
> may include to provide a good user experience. I personally have been
> wanting to make a distribution that includes a few key packages that I like
> and plotting for a while! I think with 0.4, we will be able to start doing
> this.
> 
> -viral
> 
> On Sunday, July 12, 2015 at 7:43:28 PM UTC+5:30, yu...@altern.org wrote:
> > I think there's a big differences between developing core features in
> > packages and shipping them with the default version and having optional
> > third party packages implementing core features.
> > 
> > Personally I also find the huge amount of packages to be slightly
> > annoying, but it's also clear the Julia team cannot do everything like
> > Matlab would. The only thing I really would like to have
> > included by default it plotting tools, because they are so essential for a
> > lot of things.



[julia-users] Re: Too many packages?

2015-07-12 Thread Burak Budanur
Thank you! 

I had apparently made the wrong assumption that Julia was a language for 
scientific computing only. Once I think of it as a general-purpose language, 
the current structure makes total sense, just as it does for Python. 


On Sunday, July 12, 2015 at 4:47:04 AM UTC-4, Tony Kelman wrote:
>
> As John and Matt said, a huge portion of the standard library is written 
> in Julia itself, so there's no real technical need for it to be developed 
> within the same repository. In fact developing technical features as 
> packages rather than as part of base allows getting features to users in a 
> stable way on a time scale that the package developer has control over, 
> rather than being subject to the readiness of features in the core language 
> and compiler that obviously have to be done in the base repository.
>
> > I can also find some
> > Fortran/C code, and include in Julia, and have all these 
> > functionality, but then what is the advantage of using Julia, as 
> > opposed to, say, python?
>
> Using Julia, your wrapper "glue code" will be shorter, simpler to 
> understand, more efficient, entirely in the high-level language, and yet 
> map more directly to the underlying library's interface. You just have to 
> be able to point to a standard C or Fortran shared library and you can use 
> that API directly, even interactively from the REPL. No need to invoke a 
> separate compiler or build system at install time just for the 
> language-specific binding code. Depending on the library, there will 
> usually be less marshalling and worrying about cross-language data 
> structure copying.
>
> > But since Julia is a language specifically for scientific computation
>
> That's not quite fair. Julia is a general-purpose language that happens to 
> be designed to be very good at scientific computing tasks. (And most, but 
> not all, of the early-adopter user and package developer community have 
> come from that domain.) There are pieces included by default in the 
> standard library right now that you would normally find in packages in the 
> likes of NumPy or SciPy in other languages, but this may not be the case 
> forever as the technical problems that have stood in the way of decoupling 
> that development are gradually being solved.
>
>

[julia-users] eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread Evgeni Bezus
Hi all,

I am a Julia novice and I am considering it as a potential alternative to 
MATLAB.
My field is computational nanophotonics, and the main numerical technique 
that I use involves repeatedly solving the eigenvalue/eigenvector problem 
for dense matrices of size about 1000x1000 (more or less).
I tried to run the following nearly equivalent code in Julia and in MATLAB:

Julia code:

n = 1000
M = rand(n, n)
F = eigfact(M)
tic()
for i = 1:10
F = eigfact(M)
end
toc()


MATLAB code:

n = 1000;
M = rand(n, n);
[D, V] = eig(M);
tic;
for i = 1:10
[D, V] = eig(M);
end
toc

It turns out that MATLAB's eig() runs nearly 2.3 times faster than eig() or 
eigfact() in Julia. On the machine available to me right now (a relatively 
old Core i5 laptop) the average time for MATLAB is about 37 seconds, while 
the mean Julia time is about 85 seconds. I use MATLAB R2010b and Julia 0.3.7 
(I tried running the code both in Juno and in a REPL session and obtained 
nearly identical results).

Is there anything that I'm doing wrong?

Best regards,
Evgeni


[julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread John Myles White
http://julia.readthedocs.org/en/release-0.3/manual/performance-tips/
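
In short, the usual advice from that page applied to this benchmark would look 
something like the following sketch (untested; same names as in the original 
post):

# wrap the benchmark in a function so the loop runs in local scope
function bench_eig(M, reps)
    local F
    for i = 1:reps
        F = eigfact(M)
    end
    return F
end

M = rand(1000, 1000)
bench_eig(M, 1)        # first call compiles the function
@time bench_eig(M, 10) # timed run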

On Sunday, July 12, 2015 at 8:33:56 PM UTC+2, Evgeni Bezus wrote:
>
> Hi all,
>
> I am a Julia novice and I am considering it as a potential alternative to 
> MATLAB.
> My field is computational nanophotonics and the main numerical technique 
> that I use involves multiple solution of the eigenvalue/eigenvector problem 
> for dense matrices with size of about 1000*1000 (more or less).
> I tried to run the following nearly equivalent code in Julia and in MATLAB:
>
> Julia code:
>
> n = 1000
> M = rand(n, n)
> F = eigfact(M)
> tic()
> for i = 1:10
> F = eigfact(M)
> end
> toc()
>
>
> MATLAB code:
>
> n = 1000;
> M = rand(n, n);
> [D, V] = eig(M);
> tic;
> for i = 1:10
> [D, V] = eig(M);
> end
> toc
>
> It turns out that MATLAB's eig() runs nearly 2.3 times faster than eig() 
> or eigfact() in Julia. On the machine available to me right now (relatively 
> old Core i5 laptop) the average time for MATLAB is of about 37 seconds, 
> while the mean Julia time is of about 85 seconds. I use MATLAB R2010b and 
> Julia 0.3.7 (i tried to run the code both in Juno and in a REPL session and 
> obtained nearly identical results).
>
> Is there anything that I'm doing wrong?
>
> Best regards,
> Evgeni
>


Re: [julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread Milan Bouchet-Valat
On Sunday, July 12, 2015 at 11:38 -0700, John Myles White wrote:
> http://julia.readthedocs.org/en/release-0.3/manual/performance-tips/
I don't think running the code in the global scope is the problem here:
most of the computing time is probably in BLAS anyway. I think MATLAB
uses Intel MKL while Julia uses OpenBLAS, and maybe on that particular
problem and with your particular machine the former is significantly
faster.

If you really need this 2.3 factor you could try building Julia with
MKL. See
https://github.com/JuliaLang/julia/#intel-compilers-and-math-kernel-library-mkl
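
One rough way to check whether the gap really sits in the BLAS library is to 
measure raw matrix-multiply throughput (a sketch; eig() is not pure gemm, so 
treat this only as a proxy, and note that peakflops lives in Base):

blas_set_num_threads(2)  # try 1 up to the number of physical cores
peakflops(1000)          # FLOP/s for a 1000x1000 double-precision multiply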


Regards


> On Sunday, July 12, 2015 at 8:33:56 PM UTC+2, Evgeni Bezus wrote:
> > Hi all,
> > 
> > I am a Julia novice and I am considering it as a potential 
> > alternative to MATLAB.
> > My field is computational nanophotonics and the main numerical 
> > technique that I use involves multiple solution of the 
> > eigenvalue/eigenvector problem for dense matrices with size of 
> > about 1000*1000 (more or less).
> > I tried to run the following nearly equivalent code in Julia and in 
> > MATLAB:
> > 
> > Julia code:
> > 
> > n = 1000
> > M = rand(n, n)
> > F = eigfact(M)
> > tic()
> > for i = 1:10
> > F = eigfact(M)
> > end
> > toc()
> > 
> > 
> > MATLAB code:
> > 
> > n = 1000;
> > M = rand(n, n);
> > [D, V] = eig(M);
> > tic;
> > for i = 1:10
> > [D, V] = eig(M);
> > end
> > toc
> > 
> > It turns out that MATLAB's eig() runs nearly 2.3 times faster than 
> > eig() or eigfact() in Julia. On the machine available to me right 
> > now (relatively old Core i5 laptop) the average time for MATLAB is 
> > of about 37 seconds, while the mean Julia time is of about 85 
> > seconds. I use MATLAB R2010b and Julia 0.3.7 (i tried to run the 
> > code both in Juno and in a REPL session and obtained nearly 
> > identical results).
> > 
> > Is there anything that I'm doing wrong?
> > 
> > Best regards,
> > Evgeni
> > 


[julia-users] Re: Too many packages?

2015-07-12 Thread Tony Kelman
> I think there's a big differences between developing core features in 
packages and shipping them with the default version and having optional 
third party packages implementing core features.

Like what, exactly? If the complaint is about ease of installation of 
packages, then that's a known and acknowledged set of bugs that people are 
thinking about how to fix. We could always use more help making things 
better.

> The only thing I really would like to have 
> included by default it plotting tools, because they are so essential for 
a lot of things. 

I don't think you're going to find a single plotting tool that satisfies 
everyone, unfortunately. Not everyone likes grammar-of-graphics style 
plotting, or dependencies on Tk or IPython/Jupyter or OpenGL or a web 
browser to have a usable plotting package. Once we figure out the right way 
to bundle packages and make larger distribution versions of Julia we can 
include a few different choices.



[julia-users] Re: Too many packages?

2015-07-12 Thread Cedric St-Jean


On Sunday, July 12, 2015 at 4:47:42 PM UTC-4, Tony Kelman wrote:
>
> > The only thing I really would like to have 
> > included by default it plotting tools, because they are so essential for 
> a lot of things. 
>
> I don't think you're going to find a single plotting tool that satisfies 
> everyone, unfortunately. Not everyone likes grammar-of-graphics style 
> plotting, or dependencies on Tk or IPython/Jupyter or OpenGL or a web 
> browser to have a usable plotting package. Once we figure out the right way 
> to bundle packages and make larger distribution versions of Julia we can 
> include a few different choices.
>

I'm a Julia beginner who is still using PyPlot to avoid making a decision. 
Choice is overwhelming when starting out with a new language; having a basic 
distribution that includes sensible defaults (e.g. Enthought's SciPy 
distribution) would help a lot. 


[julia-users] Re: Too many packages?

2015-07-12 Thread Tony Kelman
We can probably pick a default, and it'll probably be Gadfly, but last I 
checked Gadfly's support for running outside of IJulia is still a bit 
lacking. To distribute a working version of IJulia you have to get into the 
business of redistributing an entire Python installation, which could be 
quite a rabbit hole.


On Sunday, July 12, 2015 at 1:59:21 PM UTC-7, Cedric St-Jean wrote:
>
>
>
> On Sunday, July 12, 2015 at 4:47:42 PM UTC-4, Tony Kelman wrote:
>>
>> > The only thing I really would like to have 
>> > included by default it plotting tools, because they are so essential 
>> for a lot of things. 
>>
>> I don't think you're going to find a single plotting tool that satisfies 
>> everyone, unfortunately. Not everyone likes grammar-of-graphics style 
>> plotting, or dependencies on Tk or IPython/Jupyter or OpenGL or a web 
>> browser to have a usable plotting package. Once we figure out the right way 
>> to bundle packages and make larger distribution versions of Julia we can 
>> include a few different choices.
>>
>
> I'm a Julia beginner who is still using pyplot to avoid making a decision. 
> Choice is overwhelming when starting out with a new language, having a 
> basic distribution that includes sensible defaults (eg. Enthought's scipy 
> distribution) will help a lot. 
>


Re: [julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread Mauricio Esteban Cuak
I tried it in MATLAB R2014a and Julia 0.3.10 on a 2.5 GHz i5 and the 
difference was much smaller:

22 seconds for Julia, 19 for MATLAB.

Also, I tried it in local and global scope and the difference wasn't more 
than one or two seconds.




On Sunday, July 12, 2015 at 13:57:40 (UTC-5), Milan Bouchet-Valat wrote:
>
> Le dimanche 12 juillet 2015 à 11:38 -0700, John Myles White a écrit : 
> > http://julia.readthedocs.org/en/release-0.3/manual/performance-tips/ 
> I don't think running the code in the global scope is the problem here: 
> most of the computing time is probably in BLAS anyway. I think MATLAB 
> uses Intel MKL while Julia uses OpenBLAS, and maybe on that particular 
> problem and with your particular machine the former is significantly 
> faster. 
>
> If you really need this 2.3 factor you could try building Julia with 
> MKL. See https://github.com/JuliaLang/julia/#intel-compilers-and-math 
> -kernel-library-mkl 
> 
>  
>
>
> Regards 
>
>
> > On Sunday, July 12, 2015 at 8:33:56 PM UTC+2, Evgeni Bezus wrote: 
> > > Hi all, 
> > > 
> > > I am a Julia novice and I am considering it as a potential 
> > > alternative to MATLAB. 
> > > My field is computational nanophotonics and the main numerical 
> > > technique that I use involves multiple solution of the 
> > > eigenvalue/eigenvector problem for dense matrices with size of 
> > > about 1000*1000 (more or less). 
> > > I tried to run the following nearly equivalent code in Julia and in 
> > > MATLAB: 
> > > 
> > > Julia code: 
> > > 
> > > n = 1000 
> > > M = rand(n, n) 
> > > F = eigfact(M) 
> > > tic() 
> > > for i = 1:10 
> > > F = eigfact(M) 
> > > end 
> > > toc() 
> > > 
> > > 
> > > MATLAB code: 
> > > 
> > > n = 1000; 
> > > M = rand(n, n); 
> > > [D, V] = eig(M); 
> > > tic; 
> > > for i = 1:10 
> > > [D, V] = eig(M); 
> > > end 
> > > toc 
> > > 
> > > It turns out that MATLAB's eig() runs nearly 2.3 times faster than 
> > > eig() or eigfact() in Julia. On the machine available to me right 
> > > now (relatively old Core i5 laptop) the average time for MATLAB is 
> > > of about 37 seconds, while the mean Julia time is of about 85 
> > > seconds. I use MATLAB R2010b and Julia 0.3.7 (i tried to run the 
> > > code both in Juno and in a REPL session and obtained nearly 
> > > identical results). 
> > > 
> > > Is there anything that I'm doing wrong? 
> > > 
> > > Best regards, 
> > > Evgeni 
> > > 
>


[julia-users] How to add data/json to HTTP POST with Requests.jl

2015-07-12 Thread Martin Michel

Hi there, 
I have very little experience with Julia and want to find out whether it 
fits my needs. I want to connect to a Neo4j database and send a Cypher 
statement from Julia; see 
http://neo4j.com/docs/stable/cypher-intro-applications.html.
The authorization works fine, but I could not manage to add the json keyword 
(or data).
using Requests, HttpCommon, Codecs

db_key = "neo4j"
db_secret = "mypasswd"

# create authentication 
function enc_credentials(db_key::String, db_secret::String)
    bearer_token_credentials = "$(encodeURI(db_key)):$(encodeURI(db_secret))"
    return(base64(bearer_token_credentials))
end

response = post(URI("http://localhost:7474/db/data/transaction/commit"),
                "grant_type=client_credentials";
                headers = {"Authorization" => "Basic $(enc_credentials(db_key, db_secret))",
                           "Content-Type" => "application/json"},
                json = {"statements" => [{"statement" => "CREATE (p:Person {name:{name},born:{born}}) RETURN p",
                                          "parameters" => {"name" => "Keanu Reeves", "born" => 1964}}]})

println(response)

This results in
ERROR: unrecognized keyword argument "json"


[julia-users] Re: How to add data/json to HTTP POST with Requests.jl

2015-07-12 Thread Avik Sengupta
I think the issue is that you have the second argument 
("grant_type=client_credentials") as the data for the post, and then the 
"json=..." keyword argument also as data for the post. 

The API for post() is that it either takes the data as the second positional 
argument, or as a keyword argument "data", or as a keyword argument "json". 
It has to be only one of those options. 

Regards
-
Avik
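
P.S. Keeping only the json keyword, the call would look roughly like this (a 
sketch based on the snippet above; exact behaviour depends on the Requests.jl 
version installed):

response = post(URI("http://localhost:7474/db/data/transaction/commit");
                headers = {"Authorization" => "Basic $(enc_credentials(db_key, db_secret))",
                           "Content-Type" => "application/json"},
                json = {"statements" => [{"statement" => "CREATE (p:Person {name:{name},born:{born}}) RETURN p",
                                          "parameters" => {"name" => "Keanu Reeves", "born" => 1964}}]})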

On Sunday, 12 July 2015 23:41:26 UTC+1, Martin Michel wrote:
>
>
> Hi there, 
> I have very little experience with Julia and want to try whether it fits 
> my needs. I want to connect to a Neo4j database and send a Cypher statement 
> with Julia, see 
> http://neo4j.com/docs/stable/cypher-intro-applications.html .
> The authorization works fine, but I could not manage to add the *json 
> *keyword 
> (or *data*).
> using Requests, HttpCommon, Codecs
>
> db_key = "neo4j"
> db_secret = "mypasswd"
>
> # create authentication 
> function enc_credentials(db_key::String, db_secret::String)
> bearer_token_credentials = 
> "$(encodeURI(db_key)):$(encodeURI(db_secret))"
> return(base64(bearer_token_credentials))
> end
> response = post(URI("http://localhost:7474/db/data/transaction/commit";),
> "grant_type=client_credentials";
> headers = {"Authorization" => "Basic 
> $(enc_credentials(db_key,db_secret))", "Content-Type" => 
> "application/json"}, 
> json = {"statements"=>[{"statement"=>"CREATE (p:Person 
> {name:{name},born:{born}}) RETURN p","parameters"=>{"name"=>"Keanu Reeves"
> ,"born"=>1964}}]})
>
> println(response)
>
> This results in
> ERROR: unrecognized keyword argument "json"
>


Re: [julia-users] Re: Too many packages?

2015-07-12 Thread Miguel Bazdresch
In terms of dependencies and size, Gaston is probably minimal; it depends
on gnuplot only, which is a small binary and readily distributed on Linux,
OS X and Windows. It offers basic features only (though it has 3-D plotting),
which may be an advantage for a default package. It is also well documented
(at least, I like to think so :)

-- mb

On Sun, Jul 12, 2015 at 5:17 PM, Tony Kelman  wrote:

> We can probably pick a default, and it'll probably be Gadfly, but last I
> checked Gadfly's support for running outside of IJulia is still a bit
> lacking. To distribute a working version of IJulia you have to get into the
> business of redistributing an entire Python installation, which could be
> quite a rabbit hole.
>
>
> On Sunday, July 12, 2015 at 1:59:21 PM UTC-7, Cedric St-Jean wrote:
>>
>>
>>
>> On Sunday, July 12, 2015 at 4:47:42 PM UTC-4, Tony Kelman wrote:
>>>
>>> > The only thing I really would like to have
>>> > included by default it plotting tools, because they are so essential
>>> for a lot of things.
>>>
>>> I don't think you're going to find a single plotting tool that satisfies
>>> everyone, unfortunately. Not everyone likes grammar-of-graphics style
>>> plotting, or dependencies on Tk or IPython/Jupyter or OpenGL or a web
>>> browser to have a usable plotting package. Once we figure out the right way
>>> to bundle packages and make larger distribution versions of Julia we can
>>> include a few different choices.
>>>
>>
>> I'm a Julia beginner who is still using pyplot to avoid making a
>> decision. Choice is overwhelming when starting out with a new language,
>> having a basic distribution that includes sensible defaults (eg.
>> Enthought's scipy distribution) will help a lot.
>>
>


[julia-users] Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
Hi,

I've just seen a very strange (for me) performance difference for
exactly the same code on slightly different input with no explicit
branches.

The code is available here[1]. The most relevant part is the following
function (all other parts of the code are for initialization and
benchmarking). This is a simplified version of my simulation, which computes
the next column in the array based on the previous one.

The strange part is that the performance of this function can differ
by 10x depending on the value of the scaling factor (`eΓ`, the only use
of which is marked in the code below), even though I don't see any
branches that depend on that value in the relevant code (unless the
CPU is 10x less efficient for certain input values).

function propagate(P, ψ0, ψs, eΓ)
    @inbounds for i in 1:P.nele
        ψs[1, i, 1] = ψ0[1, i]
        ψs[2, i, 1] = ψ0[2, i]
    end
    T12 = im * sin(P.Ω)
    T11 = cos(P.Ω)
    @inbounds for i in 2:(P.nstep + 1)
        for j in 1:P.nele
            ψ_e = ψs[1, j, i - 1]
            ψ_g = ψs[2, j, i - 1] * eΓ # < Scaling factor
            ψs[2, j, i] = T11 * ψ_e + T12 * ψ_g
            ψs[1, j, i] = T11 * ψ_g + T12 * ψ_e
        end
    end
    ψs
end

The output of the full script is attached, and it can be clearly seen
that for scaling factors 0.6-0.8 the performance is 5-10 times slower
than for the others.

The assembly[2] and LLVM[3] code of this function is also in the same
repo. I see the same behavior on both 0.3 and 0.4, with LLVM 3.3
and LLVM 3.6, on two different x86_64 machines (my laptop and a Linode
VPS). The only platform I've tried that doesn't show similar behavior
is Julia 0.4 on qemu-arm, although even there the performance for
different values differs by ~30%, which is bigger than noise.

This also seems to depend on the initial value.

Has anyone seen similar problems before?
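
For context, the timings below come from a driver loop along these lines (a
sketch; the real driver is in the linked array_prop.jl):

for eΓ in 0.0:0.1:1.0
    println(eΓ)
    gc()                            # collect garbage so it doesn't skew the timing
    @time propagate(P, ψ0, ψs, eΓ)  # P, ψ0, ψs are set up earlier in the script
end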

Outputs:

325.821 milliseconds (25383 allocations: 1159 KB)
307.826 milliseconds (4 allocations: 144 bytes)
0.0
 19.227 milliseconds (2 allocations: 48 bytes)
0.1
 17.291 milliseconds (2 allocations: 48 bytes)
0.2
 17.404 milliseconds (2 allocations: 48 bytes)
0.3
 19.231 milliseconds (2 allocations: 48 bytes)
0.4
 20.278 milliseconds (2 allocations: 48 bytes)
0.5
 23.692 milliseconds (2 allocations: 48 bytes)
0.6
328.107 milliseconds (2 allocations: 48 bytes)
0.7
312.425 milliseconds (2 allocations: 48 bytes)
0.8
201.494 milliseconds (2 allocations: 48 bytes)
0.9
 16.314 milliseconds (2 allocations: 48 bytes)
1.0
 16.264 milliseconds (2 allocations: 48 bytes)


[1] 
https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/array_prop.jl
[2] 
https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.S
[3] 
https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.ll


[julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
P.S. Given how strange this problem is for me, I would appreciate it if
anyone could confirm whether this is a real issue or I'm somehow being
crazy or stupid.



On Sun, Jul 12, 2015 at 7:30 PM, Yichao Yu  wrote:
> Hi,
>
> I've just seen a very strange (for me) performance difference for
> exactly the same code on slightly different input with no explicit
> branches.
>
> The code is available here[1]. The most relavant part is the following
> function. (All other part of the code are for initialization and bench
> mark). This is a simplified version of my similation that compute the
> next array column in the array based on the previous one.
>
> The strange part is that the performance of this function can differ
> by 10x depend on the value of the scaling factor (`eΓ`, the only use
> of which is marked in the code below) even though I don't see any
> branches that depends on that value in the relavant code. (unless the
> cpu is 10x less efficient for certain input values)
>
> function propagate(P, ψ0, ψs, eΓ)
> @inbounds for i in 1:P.nele
> ψs[1, i, 1] = ψ0[1, i]
> ψs[2, i, 1] = ψ0[2, i]
> end
> T12 = im * sin(P.Ω)
> T11 = cos(P.Ω)
> @inbounds for i in 2:(P.nstep + 1)
> for j in 1:P.nele
> ψ_e = ψs[1, j, i - 1]
> ψ_g = ψs[2, j, i - 1] * eΓ # < Scaling factor
> ψs[2, j, i] = T11 * ψ_e + T12 * ψ_g
> ψs[1, j, i] = T11 * ψ_g + T12 * ψ_e
> end
> end
> ψs
> end
>
> The output of the full script is attached and it can be clearly seen
> that for scaling factor 0.6-0.8, the performance is 5-10 times slower
> than others.
>
> The assembly[2] and llvm[3] code of this function is also in the same
> repo. I see the same behavior on both 0.3 and 0.4 and with LLVM 3.3
> and LLVM 3.6 on two different x86_64 machine (my laptop and a linode
> VPS) (the only platform I've tried that doesn't show similar behavior
> is running julia 0.4 on qemu-arm... although the performance
> between different values also differ by ~30% which is bigger than
> noise)
>
> This also seems to depend on the initial value.
>
> Has anyone seen similar problems before?
>
> Outputs:
>
> 325.821 milliseconds (25383 allocations: 1159 KB)
> 307.826 milliseconds (4 allocations: 144 bytes)
> 0.0
>  19.227 milliseconds (2 allocations: 48 bytes)
> 0.1
>  17.291 milliseconds (2 allocations: 48 bytes)
> 0.2
>  17.404 milliseconds (2 allocations: 48 bytes)
> 0.3
>  19.231 milliseconds (2 allocations: 48 bytes)
> 0.4
>  20.278 milliseconds (2 allocations: 48 bytes)
> 0.5
>  23.692 milliseconds (2 allocations: 48 bytes)
> 0.6
> 328.107 milliseconds (2 allocations: 48 bytes)
> 0.7
> 312.425 milliseconds (2 allocations: 48 bytes)
> 0.8
> 201.494 milliseconds (2 allocations: 48 bytes)
> 0.9
>  16.314 milliseconds (2 allocations: 48 bytes)
> 1.0
>  16.264 milliseconds (2 allocations: 48 bytes)
>
>
> [1] 
> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/array_prop.jl
> [2] 
> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.S
> [2] 
> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.ll


[julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
On Sun, Jul 12, 2015 at 7:40 PM, Yichao Yu  wrote:
> P.S. Given how strange this problem is for me, I would appreciate if
> anyone can confirm either this is a real issue or I'm somehow being
> crazy or stupid.
>

One additional strange property of this issue is that I used to have
much more costly operations in the (outer) loop (the one that iterates over
nstep with i), like Fourier transforms. However, when the scaling
factor takes the bad value, it slows everything down (i.e. the
Fourier transform is also slower by ~10x).

>
>
> On Sun, Jul 12, 2015 at 7:30 PM, Yichao Yu  wrote:
>> Hi,
>>
>> I've just seen a very strange (for me) performance difference for
>> exactly the same code on slightly different input with no explicit
>> branches.
>>
>> The code is available here[1]. The most relavant part is the following
>> function. (All other part of the code are for initialization and bench
>> mark). This is a simplified version of my similation that compute the
>> next array column in the array based on the previous one.
>>
>> The strange part is that the performance of this function can differ
>> by 10x depend on the value of the scaling factor (`eΓ`, the only use
>> of which is marked in the code below) even though I don't see any
>> branches that depends on that value in the relavant code. (unless the
>> cpu is 10x less efficient for certain input values)
>>
>> function propagate(P, ψ0, ψs, eΓ)
>> @inbounds for i in 1:P.nele
>> ψs[1, i, 1] = ψ0[1, i]
>> ψs[2, i, 1] = ψ0[2, i]
>> end
>> T12 = im * sin(P.Ω)
>> T11 = cos(P.Ω)
>> @inbounds for i in 2:(P.nstep + 1)
>> for j in 1:P.nele
>> ψ_e = ψs[1, j, i - 1]
>> ψ_g = ψs[2, j, i - 1] * eΓ # < Scaling factor
>> ψs[2, j, i] = T11 * ψ_e + T12 * ψ_g
>> ψs[1, j, i] = T11 * ψ_g + T12 * ψ_e
>> end
>> end
>> ψs
>> end
>>
>> The output of the full script is attached and it can be clearly seen
>> that for scaling factor 0.6-0.8, the performance is 5-10 times slower
>> than others.
>>
>> The assembly[2] and llvm[3] code of this function is also in the same
>> repo. I see the same behavior on both 0.3 and 0.4 and with LLVM 3.3
>> and LLVM 3.6 on two different x86_64 machine (my laptop and a linode
>> VPS) (the only platform I've tried that doesn't show similar behavior
>> is running julia 0.4 on qemu-arm... although the performance
>> between different values also differ by ~30% which is bigger than
>> noise)
>>
>> This also seems to depend on the initial value.
>>
>> Has anyone seen similar problems before?
>>
>> Outputs:
>>
>> 325.821 milliseconds (25383 allocations: 1159 KB)
>> 307.826 milliseconds (4 allocations: 144 bytes)
>> 0.0
>>  19.227 milliseconds (2 allocations: 48 bytes)
>> 0.1
>>  17.291 milliseconds (2 allocations: 48 bytes)
>> 0.2
>>  17.404 milliseconds (2 allocations: 48 bytes)
>> 0.3
>>  19.231 milliseconds (2 allocations: 48 bytes)
>> 0.4
>>  20.278 milliseconds (2 allocations: 48 bytes)
>> 0.5
>>  23.692 milliseconds (2 allocations: 48 bytes)
>> 0.6
>> 328.107 milliseconds (2 allocations: 48 bytes)
>> 0.7
>> 312.425 milliseconds (2 allocations: 48 bytes)
>> 0.8
>> 201.494 milliseconds (2 allocations: 48 bytes)
>> 0.9
>>  16.314 milliseconds (2 allocations: 48 bytes)
>> 1.0
>>  16.264 milliseconds (2 allocations: 48 bytes)
>>
>>
>> [1] 
>> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/array_prop.jl
>> [2] 
>> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.S
>> [2] 
>> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/propagate.ll


[julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread Sheehan Olver
I remember seeing this same performance gap before.  I believe the problem 
is that OpenBLAS doesn't have the correct thread defaults. Here are other 
timings setting the threads directly:

julia> blas_set_num_threads(1)

julia> @time eigfact(M);
elapsed time: 1.827510669 seconds (79997048 bytes allocated, 1.88% gc time)

julia> blas_set_num_threads(2)

julia> @time eigfact(M);
elapsed time: 1.549618631 seconds (79997048 bytes allocated)

julia> blas_set_num_threads(3);@time eigfact(M);
elapsed time: 1.498852226 seconds (79997048 bytes allocated, 2.63% gc time)

julia> blas_set_num_threads(4);@time eigfact(M);
elapsed time: 2.062847561 seconds (79997048 bytes allocated)

On Monday, July 13, 2015 at 4:33:56 AM UTC+10, Evgeni Bezus wrote:
>
> Hi all,
>
> I am a Julia novice and I am considering it as a potential alternative to 
> MATLAB.
> My field is computational nanophotonics and the main numerical technique 
> that I use involves multiple solution of the eigenvalue/eigenvector problem 
> for dense matrices with size of about 1000*1000 (more or less).
> I tried to run the following nearly equivalent code in Julia and in MATLAB:
>
> Julia code:
>
> n = 1000
> M = rand(n, n)
> F = eigfact(M)
> tic()
> for i = 1:10
> F = eigfact(M)
> end
> toc()
>
>
> MATLAB code:
>
> n = 1000;
> M = rand(n, n);
> [D, V] = eig(M);
> tic;
> for i = 1:10
> [D, V] = eig(M);
> end
> toc
>
> It turns out that MATLAB's eig() runs nearly 2.3 times faster than eig() 
> or eigfact() in Julia. On the machine available to me right now (relatively 
> old Core i5 laptop) the average time for MATLAB is of about 37 seconds, 
> while the mean Julia time is of about 85 seconds. I use MATLAB R2010b and 
> Julia 0.3.7 (i tried to run the code both in Juno and in a REPL session and 
> obtained nearly identical results).
>
> Is there anything that I'm doing wrong?
>
> Best regards,
> Evgeni
>


[julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread Sheehan Olver
Sorry, forgot the timing with the default number of threads.

julia> @time eigfact(M);
elapsed time: 2.261110895 seconds (79997048 bytes allocated, 2.05% gc time)

On Monday, July 13, 2015 at 10:19:33 AM UTC+10, Sheehan Olver wrote:
>
> I remember seeing this same performance gap before.  I believe the problem 
> is that OpenBLAS doesn't have the correct thread defaults. Here are other 
> timings setting the threads directly:
>
> *julia> **blas_set_num_threads(1)*
>
>
> *julia> **@time eigfact(M);*
>
> elapsed time: 1.827510669 seconds (79997048 bytes allocated, 1.88% gc time)
>
>
> *julia> **blas_set_num_threads(2)*
>
>
> *julia> **@time eigfact(M);*
>
> elapsed time: 1.549618631 seconds (79997048 bytes allocated)
>
>
> *julia> **blas_set_num_threads(3);@time eigfact(M);*
>
> elapsed time: 1.498852226 seconds (79997048 bytes allocated, 2.63% gc time)
>
>
> *julia> **blas_set_num_threads(4);@time eigfact(M);*
>
> elapsed time: 2.062847561 seconds (79997048 bytes allocated)
>
> On Monday, July 13, 2015 at 4:33:56 AM UTC+10, Evgeni Bezus wrote:
>>
>> Hi all,
>>
>> I am a Julia novice and I am considering it as a potential alternative to 
>> MATLAB.
>> My field is computational nanophotonics and the main numerical technique 
>> that I use involves multiple solution of the eigenvalue/eigenvector problem 
>> for dense matrices with size of about 1000*1000 (more or less).
>> I tried to run the following nearly equivalent code in Julia and in 
>> MATLAB:
>>
>> Julia code:
>>
>> n = 1000
>> M = rand(n, n)
>> F = eigfact(M)
>> tic()
>> for i = 1:10
>> F = eigfact(M)
>> end
>> toc()
>>
>>
>> MATLAB code:
>>
>> n = 1000;
>> M = rand(n, n);
>> [D, V] = eig(M);
>> tic;
>> for i = 1:10
>> [D, V] = eig(M);
>> end
>> toc
>>
>> It turns out that MATLAB's eig() runs nearly 2.3 times faster than eig() 
>> or eigfact() in Julia. On the machine available to me right now (relatively 
>> old Core i5 laptop) the average time for MATLAB is of about 37 seconds, 
>> while the mean Julia time is of about 85 seconds. I use MATLAB R2010b and 
>> Julia 0.3.7 (i tried to run the code both in Juno and in a REPL session and 
>> obtained nearly identical results).
>>
>> Is there anything that I'm doing wrong?
>>
>> Best regards,
>> Evgeni
>>
>

[julia-users] Re: 4th Julia meetup in Japan: JuliaTokyo #4.

2015-07-12 Thread Andre P.
Some of the slides are already available here. More should be posted 
shortly.

http://juliatokyo.connpass.com/event/16570/presentation/

A few of them are in English. I noticed that more and more participants are 
presenting with English slides despite the fact that the audience is nearly 
100% native Japanese speakers, which I think is pretty amazing! Also, the 
vibe at JuliaTokyo is really great: lots of people helping each other out, 
with some really fun and interesting conversation at the after-party.

Feel free to contact us if you are visiting Japan. We would love to have 
you!

Andre

On Monday, 13 July 2015 02:53:28 UTC+9, Viral Shah wrote:
>
> Please email juli...@googlegroups.com  if you see such a 
> timeout. Often it just means that a new machine is booting up, and things 
> should work in a few minutes.
>
> Sounds like a really fun meetup. BTW, are any of these slides in English - 
> and if so, are they available anywhere?
>
> -viral
>
> On Sunday, July 12, 2015 at 1:16:45 PM UTC+5:30, ther...@gmail.com 
>  wrote:
>>
>> Hi,
>>
>>
>> On July 11th we had our 4th Julia meetup in Japan, "JuliaTokyo #4". This 
>> time we had 30+ perticipants.
>>
>>
>> ---
>>
>>
>> JuliaTokyo #4 Presentation List in English
>>
>> # Hands-on Session
>> by Michiaki Ariga
>> https://github.com/chezou/JuliaTokyoTutorial
>> (We tired to use JuliaBox, but failed with "Maximum number of JuliaBox 
>> instances active. Please try after sometime." ...)
>>
>> # Main Talks
>> 1. JuliaCon2015 Report - Sorami Hisamoto
>> 2. Julia Summer of Code: An Interim Report - Kenta Sato
>> 3. High-performance Streaming Analytics using Julia - Andre Pemmelaar
>> 4. Why don't you create Spark.jl? - @sfchaos
>> 5. Introducing QuantEcon.jl - Daisuke Oyama
>>
>> # Lightning Talks
>> 1. Material for Julia Introduction Materials - @yomichi_137
>> 2. Characteristic Color Extraction from Images - @mrkn
>> 3. Julia and I, sometimes Mocha - @vanquish
>> 4. It's Time for 3D Priting with Julia - uk24s
>> 5. Mecha-Joshi Shogi (AI Japanese Chess) - @kimrin
>> 6. Gitter and Slack - Michiaki Ariga
>>
>>
>> ---
>>
>>
>> We also had a survey on what kind of languages and softwares people use 
>> on a daily basis. 56 people (multiple choices allowed);
>>
>> language, #people
>> Python, 37
>> R, 21
>> C / Julia, 14
>> Java, 13
>> C++ / Ruby, 12
>> Excel, 7
>> Perl, 5
>> SAS / Scala, 4
>> Go / JavaScript, 3
>> Matlab / Visual Basic / Haskell / PHP / Objective C / D, 2
>> Clojure / F# / C# / .Net / SQL / Apex / ECMAScript / Elixir / Swift / 
>> Erlang / CUDA, 1
>>
>> - sorami
>>
>

[julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Kevin Owens
I can't really help you debug the IR code, but I can at least say I'm 
seeing a similar thing. It starts to slow down around just after 0.5, and 
doesn't get back to where it was at 0.5 until 0.87. Can you compare the IR 
code when two different values are used, to see what's different? When I 
tried looking at the difference between 0.50 and 0.51, the biggest thing 
that popped out to me was that the numbers after "!dbg" were different.

Even 0.50001 is a lot slower:

julia> for eΓ in 0.5:0.1:0.50015
   println(eΓ)
   gc()
   @time ψs = propagate(P, ψ0, ψs, eΓ)
   end
0.5
elapsed time: 0.065609581 seconds (16 bytes allocated)
0.50001
elapsed time: 0.875806461 seconds (16 bytes allocated)


julia> versioninfo()
Julia Version 0.3.9
Commit 31efe69 (2015-05-30 11:24 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM)2 Duo CPU T8300  @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Penryn)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3




On Sunday, July 12, 2015 at 6:45:53 PM UTC-5, Yichao Yu wrote:
>
> On Sun, Jul 12, 2015 at 7:40 PM, Yichao Yu > 
> wrote: 
> > P.S. Given how strange this problem is for me, I would appreciate if 
> > anyone can confirm either this is a real issue or I'm somehow being 
> > crazy or stupid. 
> > 
>
> One additional strange property of this issue is that I used to have 
> much costy operations in the (outer) loop (the one that iterate over 
> nsteps with i) like Fourier transformations. However, when the scaling 
> factor is taking the bad value, it slows everything down (i.e. the 
> Fourier transformation is also slower by ~10x). 
>
> > 
> > 
> > On Sun, Jul 12, 2015 at 7:30 PM, Yichao Yu  > wrote: 
> >> Hi, 
> >> 
> >> I've just seen a very strange (for me) performance difference for 
> >> exactly the same code on slightly different input with no explicit 
> >> branches. 
> >> 
> >> The code is available here[1]. The most relavant part is the following 
> >> function. (All other part of the code are for initialization and bench 
> >> mark). This is a simplified version of my similation that compute the 
> >> next array column in the array based on the previous one. 
> >> 
> >> The strange part is that the performance of this function can differ 
> >> by 10x depend on the value of the scaling factor (`eΓ`, the only use 
> >> of which is marked in the code below) even though I don't see any 
> >> branches that depends on that value in the relavant code. (unless the 
> >> cpu is 10x less efficient for certain input values) 
> >> 
> >> function propagate(P, ψ0, ψs, eΓ) 
> >> @inbounds for i in 1:P.nele 
> >> ψs[1, i, 1] = ψ0[1, i] 
> >> ψs[2, i, 1] = ψ0[2, i] 
> >> end 
> >> T12 = im * sin(P.Ω) 
> >> T11 = cos(P.Ω) 
> >> @inbounds for i in 2:(P.nstep + 1) 
> >> for j in 1:P.nele 
> >> ψ_e = ψs[1, j, i - 1] 
> >> ψ_g = ψs[2, j, i - 1] * eΓ # < Scaling factor 
> >> ψs[2, j, i] = T11 * ψ_e + T12 * ψ_g 
> >> ψs[1, j, i] = T11 * ψ_g + T12 * ψ_e 
> >> end 
> >> end 
> >> ψs 
> >> end 
> >> 
> >> The output of the full script is attached and it can be clearly seen 
> >> that for scaling factor 0.6-0.8, the performance is 5-10 times slower 
> >> than others. 
> >> 
> >> The assembly[2] and llvm[3] code of this function is also in the same 
> >> repo. I see the same behavior on both 0.3 and 0.4 and with LLVM 3.3 
> >> and LLVM 3.6 on two different x86_64 machine (my laptop and a linode 
> >> VPS) (the only platform I've tried that doesn't show similar behavior 
> >> is running julia 0.4 on qemu-arm... although the performance 
> >> between different values also differ by ~30% which is bigger than 
> >> noise) 
> >> 
> >> This also seems to depend on the initial value. 
> >> 
> >> Has anyone seen similar problems before? 
> >> 
> >> Outputs: 
> >> 
> >> 325.821 milliseconds (25383 allocations: 1159 KB) 
> >> 307.826 milliseconds (4 allocations: 144 bytes) 
> >> 0.0 
> >>  19.227 milliseconds (2 allocations: 48 bytes) 
> >> 0.1 
> >>  17.291 milliseconds (2 allocations: 48 bytes) 
> >> 0.2 
> >>  17.404 milliseconds (2 allocations: 48 bytes) 
> >> 0.3 
> >>  19.231 milliseconds (2 allocations: 48 bytes) 
> >> 0.4 
> >>  20.278 milliseconds (2 allocations: 48 bytes) 
> >> 0.5 
> >>  23.692 milliseconds (2 allocations: 48 bytes) 
> >> 0.6 
> >> 328.107 milliseconds (2 allocations: 48 bytes) 
> >> 0.7 
> >> 312.425 milliseconds (2 allocations: 48 bytes) 
> >> 0.8 
> >> 201.494 milliseconds (2 allocations: 48 bytes) 
> >> 0.9 
> >>  16.314 milliseconds (2 allocations: 48 bytes) 
> >> 1.0 
> >>  16.264 milliseconds (2 allocations: 48 bytes) 
> >> 
> >> 
> >> [1] 
> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe044c929ca821c46/julia/array_prop/array_prop.jl
>  
> >> [2] 
> https://github.com/yuyichao/explore/blob/e4be0151df33571c1c22f54fe04

Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
On Sun, Jul 12, 2015 at 8:31 PM, Kevin Owens  wrote:
> I can't really help you debug the IR code, but I can at least say I'm seeing
> a similar thing. It starts to slow down around just after 0.5, and doesn't

Thanks for the confirmation. At least I'm not crazy (or not the only
one to be more precise :P )

> get back to where it was at 0.5 until 0.87. Can you compare the IR code when
> two different values are used, to see what's different? When I tried looking
> at the difference between 0.50 and 0.51, the biggest thing that popped out
> to me was that the numbers after "!dbg" were different.

This is exactly the strange part. I don't think either LLVM or Julia
is doing constant propagation here, and different input values should
be using the same compiled function. The evidence:

1. Julia only specializes on types, not values (for now).
2. The function is not inlined, which has to be the case in global
scope and can be double-checked by adding `@noinline`. It's also too
big to be inline_worthy.
3. No codegen happens except on the first call. This can be seen
from the output of `@time`: only the first call has thousands of
allocations; the following calls have fewer than 5 (on 0.4 at least). If
Julia could compile a function with fewer than 5 allocations, I'd be very
happy.
4. My original version also has much more complicated logic to compute
that scaling factor (OK, it's just a function call, but with
parameters gathered from different arguments); I'd be really surprised
if either LLVM or Julia could reason anything about it.

The difference in the debug info should just be an artifact of
emitting it twice.
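
For what it's worth, one way to convince yourself that the same compiled code
is used for both values (a sketch; P, ψ0, ψs as in the script) is to dump the
generated code for the concrete argument types, which cannot depend on the value:

code_llvm(propagate, (typeof(P), typeof(ψ0), typeof(ψs), Float64))
code_native(propagate, (typeof(P), typeof(ψ0), typeof(ψs), Float64))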

>
> Even 0.50001 is a lot slower:
>
> julia> for eΓ in 0.5:0.1:0.50015
>println(eΓ)
>gc()
>@time ψs = propagate(P, ψ0, ψs, eΓ)
>end
> 0.5
> elapsed time: 0.065609581 seconds (16 bytes allocated)
> 0.50001
> elapsed time: 0.875806461 seconds (16 bytes allocated)

This is actually interesting, and I can confirm the same here: `0.5`
takes 28ms while 0.5001 takes 320ms. I was wondering
whether the CPU is doing something special depending on the bit pattern, but
I still don't understand why it would be so bad for a certain range
(and it is not only a function of this value either; it is also affected by
all the other values).
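
One hypothesis along those lines, and it is only a guess: with eΓ < 1 the
repeated multiplication can push values into the subnormal (denormal) range,
where many CPUs take a large penalty. A quick sketch to test that by counting
subnormal components in ψs after a run:

function count_subnormals(ψs)
    n = 0
    for x in ψs
        # complex entries: check both components (imag(x) is 0.0 for reals)
        n += issubnormal(real(x)) + issubnormal(imag(x))
    end
    return n
end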

>
>
> julia> versioninfo()
> Julia Version 0.3.9
> Commit 31efe69 (2015-05-30 11:24 UTC)
> Platform Info:
>   System: Darwin (x86_64-apple-darwin13.4.0)

Good. so it's not Linux only.

>   CPU: Intel(R) Core(TM)2 Duo CPU T8300  @ 2.40GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Penryn)
>   LAPACK: libopenblas
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.3
>
>
>
>
> On Sunday, July 12, 2015 at 6:45:53 PM UTC-5, Yichao Yu wrote:
>>
>> On Sun, Jul 12, 2015 at 7:40 PM, Yichao Yu  wrote:
>> > P.S. Given how strange this problem is for me, I would appreciate if
>> > anyone can confirm either this is a real issue or I'm somehow being
>> > crazy or stupid.
>> >
>>
>> One additional strange property of this issue is that I used to have
>> much costy operations in the (outer) loop (the one that iterate over
>> nsteps with i) like Fourier transformations. However, when the scaling
>> factor is taking the bad value, it slows everything down (i.e. the
>> Fourier transformation is also slower by ~10x).
>>
>> >
>> >
>> > On Sun, Jul 12, 2015 at 7:30 PM, Yichao Yu  wrote:
>> >> Hi,
>> >>
>> >> I've just seen a very strange (for me) performance difference for
>> >> exactly the same code on slightly different input, with no explicit
>> >> branches.
>> >>
>> >> The code is available here[1]. The most relevant part is the following
>> >> function (all other parts of the code are for initialization and
>> >> benchmarking). This is a simplified version of my simulation that
>> >> computes the next column in the array based on the previous one.
>> >>
>> >> The strange part is that the performance of this function can differ
>> >> by 10x depending on the value of the scaling factor (`eΓ`, the only use
>> >> of which is marked in the code below) even though I don't see any
>> >> branches that depend on that value in the relevant code (unless the
>> >> cpu is 10x less efficient for certain input values).
>> >>
>> >> function propagate(P, ψ0, ψs, eΓ)
>> >> @inbounds for i in 1:P.nele
>> >> ψs[1, i, 1] = ψ0[1, i]
>> >> ψs[2, i, 1] = ψ0[2, i]
>> >> end
>> >> T12 = im * sin(P.Ω)
>> >> T11 = cos(P.Ω)
>> >> @inbounds for i in 2:(P.nstep + 1)
>> >> for j in 1:P.nele
>> >> ψ_e = ψs[1, j, i - 1]
>> >> ψ_g = ψs[2, j, i - 1] * eΓ # < Scaling factor
>> >> ψs[2, j, i] = T11 * ψ_e + T12 * ψ_g
>> >> ψs[1, j, i] = T11 * ψ_g + T12 * ψ_e
>> >> end
>> >> end
>> >> ψs
>> >> end
>> >>
>> >> The output of the full script is attached, and it can be clearly seen
>> >> that for scaling factors 0.6-0.8 the performance is 5-10 times slower
>> >> than for other values.
>> >>
>> >> Th

Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
Update:

I've now got an even simpler version without any complex numbers,
using only Float64. The two loops are as small as the following LLVM IR,
and there is only simple arithmetic in the loop body.

```llvm
L9.preheader:                                     ; preds = %L12, %L9.preheader.preheader
  %"#s3.0" = phi i64 [ %60, %L12 ], [ 1, %L9.preheader.preheader ]
  br label %L9

L9:   ; preds = %L9, %L9.preheader
  %"#s4.0" = phi i64 [ %44, %L9 ], [ 1, %L9.preheader ]
  %44 = add i64 %"#s4.0", 1
  %45 = add i64 %"#s4.0", -1
  %46 = mul i64 %45, %10
  %47 = getelementptr double* %7, i64 %46
  %48 = load double* %47, align 8
  %49 = add i64 %46, 1
  %50 = getelementptr double* %7, i64 %49
  %51 = load double* %50, align 8
  %52 = fmul double %51, %3
  %53 = fmul double %38, %48
  %54 = fmul double %33, %52
  %55 = fadd double %53, %54
  store double %55, double* %50, align 8
  %56 = fmul double %38, %52
  %57 = fmul double %33, %48
  %58 = fsub double %56, %57
  store double %58, double* %47, align 8
  %59 = icmp eq i64 %"#s4.0", %12
  br i1 %59, label %L12, label %L9

L12:  ; preds = %L9
  %60 = add i64 %"#s3.0", 1
  %61 = icmp eq i64 %"#s3.0", %42
  br i1 %61, label %L14.loopexit, label %L9.preheader
```
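
For readers who would rather not parse the IR, the Julia-level shape of this simplified Float64-only kernel is roughly the following. This is a guess reconstructed from the IR above, not the author's actual script; note that, unlike the original `propagate` shown earlier, it updates a 2×nele buffer in place rather than writing each step into a new column.

```julia
# Hypothetical reconstruction of the simplified kernel (from the IR above):
# in-place update of a 2 x nele Float64 matrix, where only the second row is
# multiplied by the scaling factor eΓ on every step.
function propagate_simple!(ψs::Matrix{Float64}, nstep::Int,
                           eΓ::Float64, T11::Float64, T12::Float64)
    nele = size(ψs, 2)
    @inbounds for i in 1:nstep
        for j in 1:nele
            ψ_e = ψs[1, j]
            ψ_g = ψs[2, j] * eΓ        # repeated scaling can drive values subnormal
            ψs[2, j] = T11 * ψ_e + T12 * ψ_g
            ψs[1, j] = T11 * ψ_g - T12 * ψ_e
        end
    end
    return ψs
end
```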


Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
Further update:

I made a C++ version[1] and see a similar effect (depending on the
optimization level), so it's not a julia issue (not that I thought it
really was to begin with...).

The slowdown is present in the C++ version at all optimization
levels except -Ofast / -ffast-math. The julia version is faster than
the default builds of both gcc and clang, but slower than their
fast-math builds at higher optimization levels. For -O2 and higher, the
C++ version shows a ~100x slowdown in the slow case.

@fastmath in julia doesn't seem to have an effect on this, although
the equivalent option does for clang and gcc...

[1] 
https://github.com/yuyichao/explore/blob/5a644cd46dc6f8056cee69f508f9e995b5839a01/julia/array_prop/propagate.cpp


Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Yichao Yu
On Sun, Jul 12, 2015 at 10:30 PM, Yichao Yu  wrote:
> Further update:
>
> I made a c++ version[1] and see a similar effect (depending on
> optimization levels) so it's not a julia issue (not that I think it
> really was to begin with...).

After investigating the C++ version more, I find that the difference
between the fast_math and the non-fast_math version is that the
compiler emits a function called `set_fast_math` (see below).

From what I can tell, the function sets bit 6 and bit 15 of the MXCSR
register (for SSE), and according to this page[1] these are the DAZ and FZ
bits (both related to underflow); indeed, the `orl` immediate in the
disassembly below is 0x8040 = 0x8000 (FZ, bit 15) | 0x0040 (DAZ, bit 6).
The page also describes denormals as taking "considerably longer to
process". Since the operation I have keeps decreasing the value, I guess
it makes sense that the performance is value dependent (and it kind of
makes sense that the fft also suffers from these values).

So now the questions are:

1. How important are underflow and denormal values? Note that I'm not
catching underflow explicitly anyway, and I don't really care about
values that are really small compared to 1.

2. Is there a way to set up the SSE register the way the C compilers
do? @fastmath does not seem to be doing this.

05b0 <set_fast_math>:
 5b0:   0f ae 5c 24 fc          stmxcsr -0x4(%rsp)
 5b5:   81 4c 24 fc 40 80 00    orl     $0x8040,-0x4(%rsp)
 5bc:   00
 5bd:   0f ae 54 24 fc          ldmxcsr -0x4(%rsp)
 5c2:   c3                      retq
 5c3:   66 2e 0f 1f 84 00 00    nopw    %cs:0x0(%rax,%rax,1)
 5ca:   00 00 00
 5cd:   0f 1f 00                nopl    (%rax)

[1] http://softpixel.com/~cwright/programming/simd/sse.php
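
Both questions can be probed from within Julia. A hedged sketch (not from the original thread): it assumes the benchmark objects `P`, `ψ0`, `ψs` are in scope, uses `issubnormal` from Base, and uses `set_zero_subnormals`, which was added to Base around the 0.4 time frame, so its availability on the versions discussed here is an assumption.

```julia
# 1. Do subnormals actually appear in the result for a "slow" scaling factor?
res = propagate(P, ψ0, ψs, 0.7)
nsub = count(x -> issubnormal(real(x)) || issubnormal(imag(x)), res)
println("subnormal components: ", nsub)

# 2. Flush subnormals to zero (intended to flip the same FTZ/DAZ behaviour
#    that set_fast_math enables) and see whether the slow case recovers.
set_zero_subnormals(true)
@time propagate(P, ψ0, ψs, 0.7)
set_zero_subnormals(false)
```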

>
> The slow down is presented in the c++ version for all optimization
> levels except Ofast and ffast-math. The julia version is faster than
> the default performance for both gcc and clang but is slower in the
> fast case for higher optmization levels. For O2 and higher, the c++

The slowness of the julia version seems to be due to the multi-dimensional
arrays. Using a 1d array yields performance similar to C.

> version shows a ~100x slow down for the slow case.
>
> @fast_math in julia doesn't seem to have an effect for this although
> it does for clang and gcc...
>
> [1] 
> https://github.com/yuyichao/explore/blob/5a644cd46dc6f8056cee69f508f9e995b5839a01/julia/array_prop/propagate.cpp
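
Regarding the 1d-array remark above, a hedged sketch of what such a variant could look like (hypothetical; it mirrors the reconstruction shown after the LLVM IR earlier, with only the indexing changed):

```julia
# Same arithmetic as the matrix-based sketch, but indexing a flat
# Vector{Float64} manually, which avoids the multi-dimensional indexing
# overhead seen on 0.3/0.4-era Julia.
function propagate_flat!(ψs::Vector{Float64}, nele::Int, nstep::Int,
                         eΓ::Float64, T11::Float64, T12::Float64)
    @inbounds for i in 1:nstep
        for j in 1:nele
            base = 2 * (j - 1)
            ψ_e = ψs[base + 1]
            ψ_g = ψs[base + 2] * eΓ
            ψs[base + 2] = T11 * ψ_e + T12 * ψ_g
            ψs[base + 1] = T11 * ψ_g - T12 * ψ_e
        end
    end
    return ψs
end
```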

Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Jeffrey Sarnoff
Denormals were made part of the IEEE floating point standard after some
very careful numerical analysis showed that accommodating them would
substantively improve the quality of floating point results, and this would
lift the quality of all floating point work. Surprising as it may be,
nonetheless you (and if not you today, then you tomorrow, or one of your
neighbors today) really do care about those unusual and often rarely
observed values.

fyi
William Kahan on the introduction of denormals to the standard

and an early, important paper on this:
Effects of Underflow on Solving Linear Systems - J. Demmel 1981

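A tiny illustration (a sketch, not from the thread) of the property gradual underflow preserves: for finite floats, x != y guarantees x - y != 0 unless subnormal results are flushed to zero. It uses `set_zero_subnormals`, with the same availability caveat noted earlier.

```julia
x = 3.0e-308      # a normal Float64
y = 2.5e-308      # another normal Float64, distinct from x
println(x - y)    # about 5.0e-309: a subnormal, but still nonzero

set_zero_subnormals(true)
println(x - y)    # 0.0 under flush-to-zero, even though x != y
set_zero_subnormals(false)
```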

Re: [julia-users] Re: Strange performance problem for array scaling

2015-07-12 Thread Jeffrey Sarnoff
and this: Cleve Moler tries to see it your way 
Moler on floating point denormals 

