Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 5:23:40 PM UTC, Steven G. Johnson wrote:

On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: That wasn't what I was saying. I like the philosophy behind Julia. But in practice (as of now) even in Julia you still have to code in a certain style if you want very good performance, and that's no different than in any other language.

The goal of Julia is not to be a language in which it is *impossible* to write slow code, or a language in which all programming styles are equally fast. The goal (or at least, one of the goals) is to be an expressive, high-level dynamic language, in which it is also *possible* to write performance-critical inner-loop code.

*Summary*

Thanks (all) for answering. I agree that *possible* to write fast code is a goal, and I believe that has been achieved. Nobody commented much on my list of concerns. Yes, of course *impossible* to write slow code is a very high bar; I just thought Python, an interpreted language, wasn't a high bar :) I'm just using that as a comparison. I would like (newbie) Julia code not to be beaten by (core language) Python, or at least not by much (a constant factor). Has that been achieved? I noticed the yes/no answer on Any. Is global no longer a problem? Yes, it gets you slow code, but compared to Python? Are tuples/Dict now as fast? [I just noticed the named tuples thread.]

Then there are, of course, Python libraries that are faster than their (sometimes non-existent) Julia counterparts. My hope is that through PyCall you can use them all (I understand that to be the case) without a speed penalty. We may still have the two/N-language problem for a while, for functionality reasons but not speed reasons. The dual Julia/Python problem is, I think, much preferable to Julia/C or Python/C, and gets you all the batteries included you would want (speaking as a non-math user).

Great to see that strings are being worked on; I never wanted this thread to be just about one thing. I can now see how refcounting in Python helps strings.
I'm also looking into how to beat Python there...

That *is* different from other high-level languages, in which it is typically *not* possible to write performance-critical inner-loop code without dropping down to a lower-level language (C, Fortran, Cython...). If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I wasn't trying to say that it was specific to strings; I was saying that it is not specific to I/O, which the name would seem to indicate... and it keeps getting brought up as something that should be used for basic mutable string operations.

On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:

consider

let
    io = IOBuffer()
    write(io, rand(10))
    takebuf_array(io)
end

IOBuffer() is not specific to strings at all. Best, Tamas

On Sun, May 03 2015, Scott Jones scott.pa...@gmail.com wrote: Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Sunday, May 3, 2015 at 6:10:00 PM UTC-4, Kevin Squire wrote:

One thing I was confused about when I first started using Julia was that things that are done with strings in other languages are often done directly with IO objects in Julia. For example, consider that, in Python, most classes define `__str__()` and `__repr__()`, which create string representations of objects of this class (the first more meant for human consumption, the second for parsing (usually)). In Julia, the implicit assumption is that most strings are meant for output in some way, so why not skip the extra memory allocation and write the string representation directly to output. For this, types define `show(io::IO, x::MyType)`. If you really want to manipulate such strings, you can (as pointed out in this thread) go through an IOBuffer object first. (There is also `repr(x::SomeType)`, but it's not emphasized as much.)

Problem is, with what I'm doing, the strings are almost never written to output... they are analyzed, modified, stored in and retrieved from a database... and you want all the normal string operations... you might be doing regex search/replace, for example... and for performance reasons, you don't want to be converting to an immutable string all the time.

This was a design decision made early on. I personally found (and still find) it somewhat awkward at times, but for many things, it works fine, and (seemingly) it lets most string output allocate less memory by default. Now, it certainly is the case that mutable strings may be very useful in some contexts. The BioSeq.jl package implements mutable DNA and protein sequences, which are very useful there, and would be represented by mutable strings in many other languages. The best way to test that would probably be to create a package (say, MutableStrings.jl), and define useful types and functions there.
There are a few things I'd like to add to Julia wrt strings: validated strings (right now, it is a bit of a mishmash as to whether or not convert functions will accept invalid Unicode data), and mutable strings... Somebody already did create a MutableStrings.jl; however, it is broken, it doesn't look like it has been updated in over a year, and it only covers ASCII and UTF-8: it doesn't have UTF-16 or UTF-32 mutable strings. (I also want mutable 8-bit (ANSI Latin-1) strings and UCS-2 strings (i.e. UTF-16 with no surrogates), so that they would be DirectIndexStrings, to get O(1) instead of O(n) for some operations.)

Cheers, Kevin
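For comparison, Python also lacks a mutable text string, but its `bytearray` gives a mutable byte string with O(1) indexing, roughly the behaviour a DirectIndexString-backed mutable string would give. A minimal sketch (the data here is invented for illustration):

```python
# bytearray: a mutable byte string with O(1) indexing and in-place edits.
buf = bytearray(b"hello world")

buf[0:5] = b"HELLO"      # in-place slice replacement, no new string allocated
buf.extend(b"!")          # grow in place (amortized)

assert bytes(buf) == b"HELLO world!"
assert buf[0] == ord("H")  # O(1) byte indexing
```

The trade-off is the same one discussed above: `bytearray` works on bytes, not code points, so it only gives O(1) indexing for fixed-width encodings such as Latin-1 or UCS-2.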
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I think you misunderstand: IOBuffer is suggested not for mutable string operations in general, but only for efficient concatenation of many strings. Best, Tamas

On Mon, May 04 2015, Scott Jones scott.paul.jo...@gmail.com wrote: I wasn't trying to say that it was specific to strings; I was saying that it is not specific to I/O, which the name would seem to indicate... and it keeps getting brought up as something that should be used for basic mutable string operations.

On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:

consider

let
    io = IOBuffer()
    write(io, rand(10))
    takebuf_array(io)
end

IOBuffer() is not specific to strings at all. Best, Tamas

On Sun, May 03 2015, Scott Jones scott.pa...@gmail.com wrote: Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On May 4, 2015, at 3:21 AM, Tamas Papp tkp...@gmail.com wrote: I think you misunderstand: IOBuffer is suggested not for mutable string operations in general, but only for efficient concatenation of many strings. Best, Tamas

I don’t think that I misunderstood - it’s that using IOBuffer is the only solution that has been given here… and it doesn’t handle what I need to do efficiently... If you have a better solution, please let me know…

Scott

On Mon, May 04 2015, Scott Jones scott.paul.jo...@gmail.com wrote: I wasn't trying to say that it was specific to strings; I was saying that it is not specific to I/O, which the name would seem to indicate... and it keeps getting brought up as something that should be used for basic mutable string operations.

On Sunday, May 3, 2015 at 3:20:43 PM UTC-4, Tamas Papp wrote:

consider

let
    io = IOBuffer()
    write(io, rand(10))
    takebuf_array(io)
end

IOBuffer() is not specific to strings at all. Best, Tamas

On Sun, May 03 2015, Scott Jones scott.pa...@gmail.com wrote: Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Mon, May 04 2015, Scott Jones scott.paul.jo...@gmail.com wrote:

On May 4, 2015, at 3:21 AM, Tamas Papp tkp...@gmail.com wrote: I think you misunderstand: IOBuffer is suggested not for mutable string operations in general, but only for efficient concatenation of many strings. Best, Tamas

I don’t think that I misunderstood - it’s that using IOBuffer is the only solution that has been given here… and it doesn’t handle what I need to do efficiently... If you have a better solution, please let me know…

1. Can you share the benchmarks (and simplified, self-contained code) for your problem using IOBuffer? I have always found it very fast, but maybe what you are working on is different.
2. Do you have a specific algorithm in mind that would be more efficient?

Best, Tamas
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I should be clear: I didn't mean that all strings should be immutable, but rather that I also want to have mutable strings available... There is a package for that, but 1) I think it's incomplete (I may need to contribute to it), and 2) I think that they do belong in the base language... CLU had both, which was very nice... For many things, IOBuffer is exactly the right way of doing things (the name is misleading though... maybe it should have been StringBuffer...), but there are use cases where you are constantly modifying the string while performing other string operations on it...
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
You really should ask the language designers about this for a definitive answer, but one of the reasons strings are immutable in Julia (and in Java and others) is that it makes them good keys for Dicts.

On Saturday, May 2, 2015 at 7:16:24 PM UTC+2, Jameson wrote:

IOBuffer does not inherit from string, nor does it implement any of the methods expected of a mutable string (length, endof, insert! / splice! / append!). If you want strings that support all of those operations, then you will need something different from an IOBuffer. If you just wanted a fast string builder, then IOBuffer is the right abstraction (ending with a call to `takebuf_string`). This dichotomy helps to give a clear distinction in the code between the construction phase and the usage phase.

On Sat, May 2, 2015 at 12:49 PM Páll Haraldsson pall.ha...@gmail.com wrote: 2015-05-01 16:42 GMT+00:00 Steven G. Johnson steve...@gmail.com:

In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string, and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python; it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia). You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task. There will certainly be tasks where Python is still superior in this sense, simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code.

What I would like to know is: do you need to make your own string type to make Julia as fast (to within a constant factor) as, say, Python? In another answer IOBuffer was said to be not good enough.

-- Palli.
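Steven's concatenation point maps directly onto Python, the thread's usual yardstick: repeated `+=` on an immutable string is O(n^2) in principle (CPython sometimes optimizes it in place via refcounting, as noted earlier in the thread), while accumulating into a buffer and extracting once - the moral equivalent of Julia's IOBuffer plus takebuf_string - is O(n). A minimal sketch:

```python
import io

def build_by_concat(parts):
    # Repeated concatenation: each += may copy the whole string so far,
    # giving O(n^2) total work in the worst case.
    s = ""
    for p in parts:
        s += p
    return s

def build_by_buffer(parts):
    # Buffer-and-extract-once: the analogue of writing to a Julia IOBuffer
    # and finishing with takebuf_string; O(n) total work.
    buf = io.StringIO()
    for p in parts:
        buf.write(p)
    return buf.getvalue()

parts = ["chunk"] * 1000
assert build_by_concat(parts) == build_by_buffer(parts)
```

Either way the result is identical; the difference is only in how many intermediate copies get made, which is exactly the "do things slightly differently" Steven describes.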
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
One thing I was confused about when I first started using Julia was that things that are done with strings in other languages are often done directly with IO objects in Julia. For example, consider that, in Python, most classes define `__str__()` and `__repr__()`, which create string representations of objects of this class (the first more meant for human consumption, the second for parsing (usually)). In Julia, the implicit assumption is that most strings are meant for output in some way, so why not skip the extra memory allocation and write the string representation directly to output. For this, types define `show(io::IO, x::MyType)`. If you really want to manipulate such strings, you can (as pointed out in this thread) go through an IOBuffer object first. (There is also `repr(x::SomeType)`, but it's not emphasized as much.)

This was a design decision made early on. I personally found (and still find) it somewhat awkward at times, but for many things, it works fine, and (seemingly) it lets most string output allocate less memory by default. Now, it certainly is the case that mutable strings may be very useful in some contexts. The BioSeq.jl package implements mutable DNA and protein sequences, which are very useful there, and would be represented by mutable strings in many other languages. The best way to test that would probably be to create a package (say, MutableStrings.jl), and define useful types and functions there.

Cheers, Kevin

On Sun, May 3, 2015 at 12:20 PM, Tamas Papp tkp...@gmail.com wrote:

consider

let
    io = IOBuffer()
    write(io, rand(10))
    takebuf_array(io)
end

IOBuffer() is not specific to strings at all. Best, Tamas

On Sun, May 03 2015, Scott Jones scott.paul.jo...@gmail.com wrote: Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
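Kevin's Python comparison can be made concrete with a small class (invented here purely for illustration) defining both methods:

```python
class Point:
    """Toy class showing Python's two string-representation hooks."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        # Human-readable form, used by print() and str().
        return f"({self.x}, {self.y})"

    def __repr__(self):
        # Unambiguous form, ideally valid source code recreating the object.
        return f"Point({self.x!r}, {self.y!r})"

p = Point(1, 2)
assert str(p) == "(1, 2)"
assert repr(p) == "Point(1, 2)"
```

The Julia counterpart skips the intermediate string: `show(io::IO, p::Point)` writes the representation straight to a stream, and `repr` is implemented on top of that by routing through an IOBuffer.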
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Why should it be called StringBuffer when another common use of it is to write raw binary data?
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
consider

let
    io = IOBuffer()
    write(io, rand(10))
    takebuf_array(io)
end

IOBuffer() is not specific to strings at all. Best, Tamas

On Sun, May 03 2015, Scott Jones scott.paul.jo...@gmail.com wrote: Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Because you can have binary strings and text strings... there is even a special literal for binary strings...

b"\xffThis is a binary\x01string"
"This is a \u307 text string"

Calling it an IOBuffer makes it sound like it is specific to I/O, not just strings (binary or text) that you might never do I/O on...

On Sunday, May 3, 2015 at 2:43:14 PM UTC-4, Kristoffer Carlsson wrote: Why should it be called StringBuffer when another common use of it is to write raw binary data?
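Python draws the same binary/text line that Scott describes for Julia: `b"..."` literals produce `bytes`, plain literals produce `str`. A small sketch (Python writes the combining mark as `\u0307`):

```python
# A binary string: raw bytes, indexing yields integers.
binary = b"\xffThis is a binary\x01string"

# A text string: Unicode code points, not bytes.
text = "This is a \u0307 text string"

assert isinstance(binary, bytes)
assert isinstance(text, str)
assert binary[0] == 0xFF        # first byte, as an integer
assert "\u0307" in text          # U+0307 COMBINING DOT ABOVE
```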
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
IOBuffer does not inherit from string, nor does it implement any of the methods expected of a mutable string (length, endof, insert! / splice! / append!). If you want strings that support all of those operations, then you will need something different from an IOBuffer. If you just wanted a fast string builder, then IOBuffer is the right abstraction (ending with a call to `takebuf_string`). This dichotomy helps to give a clear distinction in the code between the construction phase and the usage phase.

On Sat, May 2, 2015 at 12:49 PM Páll Haraldsson pall.haralds...@gmail.com wrote: 2015-05-01 16:42 GMT+00:00 Steven G. Johnson stevenj@gmail.com:

In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string, and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python; it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia). You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task. There will certainly be tasks where Python is still superior in this sense, simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code.

What I would like to know is: do you need to make your own string type to make Julia as fast (to within a constant factor) as, say, Python? In another answer IOBuffer was said to be not good enough. -- Palli.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
2015-05-01 16:42 GMT+00:00 Steven G. Johnson stevenj@gmail.com:

In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string, and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python; it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia). You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task. There will certainly be tasks where Python is still superior in this sense, simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code.

What I would like to know is: do you need to make your own string type to make Julia as fast (to within a constant factor) as, say, Python? In another answer IOBuffer was said to be not good enough. -- Palli.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck.

This is the key take-home message: Julia is intended to allow both quick, simple, interactive, dynamic code and optimised, fast code to be written in one language. I think Stefan announced Julia as "we want it all" :)
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 9:16:43 AM UTC-4, Tim Holy wrote: On Friday, May 01, 2015 03:19:03 AM Scott Jones wrote:

As the string grows, Julia's internals end up having to reallocate the memory and sometimes copy it to a new location, hence the O(n^2) nature of the code.

Small correction: push! is not O(n^2), it's O(n log n). Internally, the storage array grows by factors of 2 [1]; after one allocation of size 2n you can add n more elements without reallocating.

Good to know. I hate to say it, but the performance looked so bad to me that I didn't bother to see if it even had that optimization (which is exactly what I did for strings in the language I used to develop). Does it always grow by factors of 2? That might not be so good... we found that after a certain point, it was better to increase in chunks, say of 64K, or 1M, because increasing the size of large LOBs that way could make you run out of memory fairly quickly...

That said, O(n log n) can be pretty easily beat by O(2n): make one pass through and count how many you'll need, allocate the whole thing, and then stuff in elements. As you seem to be planning to do.

Yes, and I have very nice performance improvements to show for it (most were around 4-10x faster; go look at what I put in my gist), and that's even with my pure Julia version... :-)

--Tim

[1] Last I looked, that is; there was some discussion about switching it to something like 1.5 because of various discussions of memory fragmentation and reuse.

Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger...
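The doubling scheme Tim describes can be sketched as a toy growth policy in Python (the function name is invented for illustration; this is not Julia's actual implementation):

```python
def grow_capacity(cap, needed):
    # Geometric growth: double until the request fits. Over n appends the
    # total copying cost stays linear in n (amortized O(1) per push).
    cap = max(cap, 1)
    while cap < needed:
        cap *= 2
    return cap

# Appending one element at a time only reallocates at powers of two.
reallocs = []
cap = 0
for n in range(1, 100):
    if n > cap:
        cap = grow_capacity(cap, n)
        reallocs.append(cap)

assert reallocs == [1, 2, 4, 8, 16, 32, 64, 128]
```

Note the waste the thread worries about: after the jump to capacity 128, up to half the buffer can sit unused, which is why Scott suggests switching to chunked growth for very large buffers.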
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:

Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger...

I see your point, but it will also break the O(n log n) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM. So we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble?

--Tim
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Steven -- I agree, and I find it very refreshing that you're willing to judge a language by more than just performance. Any given language can always be optimized better, so ideally you want to compare them by more robust criteria. Obviously a particular system might have a well-tuned library routine that's faster than our equivalent. But think about it: is having a slow interpreter, and relying on code to spend all its time in pre-baked library kernels, the *right* way to get performance? That's just the same boring design that has been used over and over again, in Matlab, IDL, Octave, R, etc. In those cases the language isn't bringing much to the table, except a pile of rules about how important code must still be written in C/Fortran, and how your code must be vectorized or shame on you.

On Fri, May 1, 2015 at 11:48 AM, Tim Holy tim.h...@gmail.com wrote: On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:

Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger...

I see your point, but it will also break the O(n log n) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM. So we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble?

--Tim
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
The threshold would likely be most beneficial if it was based on page size (which is constant relative to RAM size). For small allocations (less than several megabytes), a modern malloc implementation typically uses a pool, so growing an allocation (except by a small amount) will probably result in a copy anyway, and no memory reuse. Once malloc switches to direct mmap calls, then it probably makes sense to add pages at a more gradual rate.

On Fri, May 1, 2015 at 11:48 AM Tim Holy tim.h...@gmail.com wrote: On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote:

Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger...

I see your point, but it will also break the O(n log n) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM. So we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble?

--Tim
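The hybrid policy under discussion - double up to a cutoff, then grow by fixed chunks - can be sketched as follows. The threshold and chunk size here are invented for illustration; real values would depend on the malloc/mmap crossover Jameson describes:

```python
THRESHOLD = 1 << 20   # hypothetical cutoff (1 MiB): stop doubling beyond this
CHUNK = 64 * 1024     # hypothetical fixed growth step (64 KiB) for large buffers

def next_capacity(cap):
    # Hybrid growth: geometric for small buffers (fast amortized appends),
    # additive chunks for large ones (bounded worst-case wasted memory).
    if cap < THRESHOLD:
        return max(1, cap * 2)
    return cap + CHUNK

assert next_capacity(0) == 1
assert next_capacity(1024) == 2048
assert next_capacity(1 << 20) == (1 << 20) + 64 * 1024
```

This trades away the amortized-linear copying guarantee of pure doubling for a lower peak-memory overshoot, which is exactly the tension Tim raises in his reply.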
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Of course I'm not saying loops should not be benchmarked, and I do use loops in Julia also. I'm just saying that when doing performance comparisons one should try to write the programs in each language in their most optimal style, rather than in a similar style which is optimal for one language but very suboptimal in another. Ah, I didn't know the article was rebutted by Stefan. I read that article before that happened and just looked it up again now as an example. I guess the conclusion is that cross-language performance benchmarks are very tricky, which was kinda my point :)

On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote:

Hi Steven, I understand your point---you're saying you'd be unlikely to write those algorithms in that manner, if your goal were to do those particular computations. But the important point to keep in mind is that those benchmarks are simply toys for the purpose of testing the performance of various language constructs. If you think it's irrelevant to benchmark loops for scientific code, then you do very, very different stuff than me. Not all algorithms reduce to BLAS calls. I use Julia to write all kinds of algorithms that I used to write MEX functions for, back in my Matlab days. If all you need is A*b, then of course basically any scientific language will be just fine, with minimal differences in performance.

Moreover, that R benchmark on cumsum is simply not credible. I'm not sure what was happening (and that article doesn't post its code or the procedures used to test), but Julia's cumsum reduces to efficient machine code (basically, a bunch of addition operations). If they were computing cumsum across a specific dimension, then this PR: https://github.com/JuliaLang/julia/pull/7359 changed things. But more likely, someone forgot to run the code twice (so it got JIT-compiled), had a type instability in the code they were testing, or made some other mistake. It's too bad one can make mistakes, of course, but then it becomes a comparison of different programmers rather than of different programming languages. Indeed, if you read the comments in that post, Stefan already rebutted that benchmark, with a 4x advantage for Julia: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89

--Tim

On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote:

I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between Numba-compiled SciPy code and Julia. I suspect the performance will be very close then (and close to C performance). Similarly, the standard benchmark (on the opening page of the Julia website) between R and Julia is also flawed, because it takes the best-case scenario for Julia (loops and mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beat Julia; see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/. So my interest in Julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that) but because it's a clean, interesting language (still needs work for some rough edges of course) with clean(er), clear(er) libraries, and that gives reasonable performance out of the box without much tweaking.

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote:

Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code.. (assuming the same algorithm; note the row- vs. column-major caveat). The main point of mine: *should* Python at any time win?

2015-04-30 21:36 GMT+00:00 Sisyphuss zhengw...@gmail.com: This post interests me. I'll write something here to follow this post. The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example, `(x**2)**2` is 100x faster than `x**4`.

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As a best
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I'll quote one of my comments on this StackOverflow question http://stackoverflow.com/questions/9968578/speeding-up-julias-poorly-written-r-examples : That all depends on what you are trying to measure. Personally, I'm not at all interested in how fast one can compute Fibonacci numbers. Yet that is one of our benchmarks. Why? Because I am very interested in how well languages support recursion – and the doubly recursive algorithm happens to be a great test of recursion, precisely because it is such a terrible way to compute Fibonacci numbers. So what would be learned by comparing an intentionally slow, excessively recursive algorithm in C and Julia against a tricky, clever, vectorized algorithm in R? Nothing at all. On Fri, May 1, 2015 at 12:58 PM, Steven Sagaert steven.saga...@gmail.com wrote: Of course I'm not saying loops should not be benchmarked and I do use loops in julia also. I'm just saying that when doing performance comparison one should try to write the programs in each language in their most optimal style rather than similar style which is optimal for one language but very suboptimal in another language. Ah I didn't know the article was rebutted by Stefan. I read that article before that happened and just looked it up again now as an example. I guess the conclusion is that cross-language performance benchmarks are very tricky which was kinda my point :) On Friday, May 1, 2015 at 3:13:24 PM UTC+2, Tim Holy wrote: Hi Steven, I understand your point---you're saying you'd be unlikely to write those algorithms in that manner, if your goal were to do those particular computations. But the important point to keep in mind is that those benchmarks are simply toys for the purpose of testing performance of various language constructs. If you think it's irrelevant to benchmark loops for scientific code, then you do very, very different stuff than me. Not all algorithms reduce to BLAS calls. 
I use julia to write all kinds of algorithms that I used to write MEX functions for, back in my Matlab days. If all you need is A*b, then of course basically any scientific language will be just fine, with minimal differences in performance. Moreover, that R benchmark on cumsum is simply not credible. I'm not sure what was happening (and that article doesn't post its code or the procedures used to test), but julia's cumsum reduces to efficient machine code (basically, a bunch of addition operations). If they were computing cumsum across a specific dimension, then this PR: https://github.com/JuliaLang/julia/pull/7359 changed things. But more likely, someone forgot to run the code twice (so it got JIT-compiled), had a type instability in the code they were testing, or made some other mistake. It's too bad one can make mistakes, of course, but then it becomes a comparison of different programmers rather than different programming languages. Indeed, if you read the comments in that post, Stefan already rebutted that benchmark, with a 4x advantage for Julia: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89 --Tim

On Friday, May 01, 2015 01:25:50 AM Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and Julia. I suspect the performance will be very close then (and close to C performance). Similarly, the standard benchmark (on the opening page of the julia website) between R and Julia is also flawed, because it takes the best-case scenario for julia (loops & mutable data structures) and the worst-case scenario for R.
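Tim's remark that cumsum "reduces to efficient machine code (basically, a bunch of addition operations)" is easy to see: a cumulative sum is just one running-total loop. A sketch (Python used here for illustration; a type-stable Julia loop of the same shape compiles to tight machine code):

```python
def cumsum(xs):
    # One addition and one append per element: O(n) total work.
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out

print(cumsum([1, 2, 3, 4]))  # → [1, 3, 6, 10]
```

The pitfalls Tim suspects in the R benchmark (forgetting to warm up the JIT, or a type instability such as an accumulator whose type changes mid-loop) would slow the Julia version dramatically without changing this algorithmic picture at all.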
When the same R program is rewritten in vectorised style it beats julia; see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/. So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that) but because it's a clean, interesting language (still needs work for some rough edges of course) with clean(er), clear(er) libraries, and it gives reasonable performance out of the box without much tweaking.

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Thursday, April 30, 2015 at 6:10:58 PM UTC-4, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) The 800x faster example that you've referred to several times, if I recall correctly, is one where you repeatedly concatenate strings. In CPython, under certain circumstances, this is optimized to mutating one of the strings in-place and is consequently O(n) where n is the final length, although this is not guaranteed by the language itself. In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python, it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia). You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task. There will certainly be tasks where Python is still superior in this sense simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code.
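The asymptotic difference Steven describes can be sketched concretely (Python shown; the buffered variant plays the role that writing to an IOBuffer plays in Julia):

```python
def concat_naive(pieces):
    # Each += may copy the entire string built so far; without CPython's
    # special-case in-place optimization, total work is O(n^2) in the
    # final length n.
    s = ""
    for p in pieces:
        s += p
    return s

def concat_buffered(pieces):
    # Collect the pieces and join once: a single O(n) allocation pass.
    return "".join(pieces)

pieces = ["ab"] * 1000
assert concat_naive(pieces) == concat_buffered(pieces)
```

Both produce the same string; only the growth pattern differs. The "same code translated literally" trap is exactly that the naive version happens to be fast in CPython but quadratic almost everywhere else.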
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now) even in julia you still have to code in a certain style if you want very good performance and that's no different than in any other language. The goal of Julia is not to be a language in which it is *impossible* to write slow code, or a language in which all programming styles are equally fast. The goal (or at least, one of the goals) is to be an expressive, high-level dynamic language, in which it is also *possible* to write performance-critical inner-loop code. That *is* different from other high-level languages, in which it is typically *not* possible to write performance-critical inner-loop code without dropping down to a lower-level language (C, Fortran, Cython...). If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Obviously a particular system might have a well-tuned library routine that's faster than our equivalent. But think about it: is having a slow interpreter, and relying on code to spend all its time in pre-baked library kernels, the *right* way to get performance? That's just the same boring design that has been used over and over again, in matlab, IDL, octave, R, etc. In those cases the language isn't bringing much to the table, except a pile of rules about how important code must still be written in C/Fortran, and how your code must be vectorized or shame on you.

That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now) even in julia you still have to code in a certain style if you want very good performance, and that's no different than in any other language. Ideally of course the compiler should be able to optimize the code so that different styles (e.g. functional/vectorized style vs imperative/loops style) give the same performance and the programmer doesn't have to think about it; maybe one day it will be like that in julia, but we're not quite there yet AFAIK. Having said that, I like Julia and hopefully it will keep on getting better/faster. So good job and keep up the good work.

On Fri, May 1, 2015 at 11:48 AM, Tim Holy tim@gmail.com wrote: On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote: Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger... I see your point, but it will also break the O(n log n) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM.
So, we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble? --Tim
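The trade-off Tim and Scott are debating (doubling the buffer vs. growing by fixed chunks) can be sketched with a toy copy-count model. This is a hypothetical illustration of the general technique, not Julia's actual allocator:

```python
def copies_when_growing(n, grow):
    # Count how many elements get copied while appending n items,
    # assuming each reallocation copies the current contents to the
    # new, larger buffer.
    cap, size, copied = 1, 0, 0
    for _ in range(n):
        if size == cap:
            copied += size
            cap = grow(cap)
        size += 1
    return copied

n = 1 << 16
doubling = copies_when_growing(n, lambda c: 2 * c)    # geometric growth: O(n) total copies
chunked = copies_when_growing(n, lambda c: c + 4096)  # fixed chunks: O(n^2 / chunk) total copies
print(doubling, chunked)
```

Geometric growth keeps appends amortized O(1), which is what algorithms built on top rely on; fixed chunks waste less memory near the RAM limit but make total copying quadratic, which is the scaling break Tim is pointing at.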
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 1:23:40 PM UTC-4, Steven G. Johnson wrote: On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now) even in julia you still have to code in a certain style if you want very good performance and that's no different than in any other language. The goal of Julia is not to be a language in which it is *impossible* to write slow code, or a language in which all programming styles are equally fast. The goal (or at least, one of the goals) is to be an expressive, high-level dynamic language, in which it is also *possible* to write performance-critical inner-loop code. Yep, totally agree! I had to deal with people (smart people too, who went to MIT also ;-) ) who expected the compiler/interpreter to magically improve their O(n^2) code! That *is* different from other high-level languages, in which it is typically *not* possible to write performance-critical inner-loop code without dropping down to a lower-level language (C, Fortran, Cython...). If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck. Also, very true... I do hope that any issues that make my C version of UTF conversion routines faster than my equivalent Julia versions will be addressed before too long. (and I don't even think it is that far off, or hard for any particular reason)
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 11:48:21 AM UTC-4, Tim Holy wrote: On Friday, May 01, 2015 08:03:31 AM Scott Jones wrote: Still, same issue as I described above... probably better to increase by 2x up to a point, and then by chunk sizes, where the chunk sizes might slowly get larger... I see your point, but it will also break the O(nlogn) scaling. We couldn't hard-code the cutoff, because some people run julia on machines with 4GB of RAM and others with 1TB of RAM. So, we could query the amount of RAM available and switch based on that result, but since all this would only make a difference for operations that consume between 0.5x and 1x the user's RAM (which to me seems like a very narrow window, on the log scale), is it really worth the trouble? --Tim For what I was doing, yes, it was definitely worth the trouble, because you'd have systems with 10s of thousands of processes (the limit was 64K on a single node), and you had to be very careful about not using up too much memory, and ending up thrashing... Very different than when you maybe have a process for each core, and you have lots of memory for each one... Different usage... different performance issues...
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 12:42:57 PM UTC-4, Steven G. Johnson wrote: On Thursday, April 30, 2015 at 6:10:58 PM UTC-4, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) The 800x faster example that you've referred to several times, if I recall correctly, is one where you repeatedly concatenate strings. In CPython, under certain circumstances, this is optimized to mutating one of the strings in-place and is consequently O(n), where n is the final length, although this is not guaranteed by the language itself. In Julia, Ruby, Java, Go, and many other languages, concatenation allocates a new string, and hence building a string by repeated concatenation is O(n^2). That doesn't mean that those other languages lose on string processing to Python; it just means that you have to do things slightly differently (e.g. write to an IOBuffer in Julia).

I just don't think that IOBuffers are a very good way to do that... what I really need are mutable strings... and I know there is a package, and I need to investigate that further... it's something that would be nice to have as part of the core of the language, instead of having to use either Vectors or IOBuffers... As a new user, I would think: if I'm not doing IO, why should I be using an IOBuffer?

You can't always expect the *same code* (translated as literally as possible) to be the optimal approach in different languages, and it is inflammatory to compare languages according to this standard. I was not intending to be inflammatory, just relating what my first experience was, which led me to investigate much more deeply the good and bad issues in Julia wrt performance (more good than bad, by a long shot). A fairer question is whether it is *much harder* to get good performance in one language vs. another for a certain task.
There will certainly be tasks where Python is still superior in this sense, simply because there are many cases where Python calls highly tuned C libraries for operations that have not been as optimized in Julia. Julia will tend to shine the further you stray from built-in operations in your performance-critical code. Yes, that is true... and that is why I'm betting on Julia in the long run (the other option for a lot of the code would have been Python or C++11, and I've already found Julia easier to deal with than either of them, even in its pre-1.0 state)
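Scott's wish for mutable strings can be approximated today with a mutable byte buffer. This Python sketch uses bytearray, which plays roughly the role of a Vector{UInt8} or IOBuffer in Julia (the workflow is illustrative, not any specific package's API):

```python
# Build up text in a mutable buffer instead of immutable strings.
buf = bytearray()
for word in ("foo", "bar", "baz"):
    buf += word.encode()     # amortized O(1) growth, no full-string copy per step
buf[0:3] = b"FOO"            # in-place edit: no new string object allocated
print(buf.decode())  # → FOObarbaz
```

The key property is in-place mutation: edits and appends touch only the bytes involved, which is what repeated concatenation of immutable strings cannot give you.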
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 12:38:33 PM UTC-4, Jeff Bezanson wrote: Steven -- I agree, and I find it very refreshing that you're willing to judge a language by more than just performance. Any given language can always be optimized better, so ideally you want to compare them by more robust criteria. Obviously a particular system might have a well-tuned library routine that's faster than our equivalent. But think about it: is having a slow interpreter, and relying on code to spend all its time in pre-baked library kernels, the *right* way to get performance? That's just the same boring design that has been used over and over again, in matlab, IDL, octave, R, etc. In those cases the language isn't bringing much to the table, except a pile of rules about how important code must still be written in C/Fortran, and how your code must be vectorized or shame on you.

That's a very good point... and is one of the things I like a lot about Julia... Even with my initial surprise about a single performance issue (building up a string by concatenation), I did NOT judge Julia by that alone, and have been quite happy with it overall [and I've been converting all of the developers at the startup where I'm consulting into Julia fans]. I also have faith, from what I've seen so far, that performance issues *will* be addressed, as well as possible considering the architecture and goals of the language, by a number of pretty smart people, both in and outside of the core team. Scott
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 7:23:40 PM UTC+2, Steven G. Johnson wrote: On Friday, May 1, 2015 at 1:12:00 PM UTC-4, Steven Sagaert wrote: That wasn't what I was saying. I like the philosophy behind julia. But in practice (as of now) even in julia you still have to code in a certain style if you want very good performance and that's no different than in any other language. The goal of Julia is not to be a language in which it is *impossible* to write slow code, or a language in which all programming styles are equally fast. I didn't say that was a goal of Julia, but it sure would be nice to have though :) but probably an impossible dream. The goal (or at least, one of the goals) is to be an expressive, high-level dynamic language, in which it is also *possible* to write performance-critical inner-loop code. That *is* different from other high-level languages, in which it is typically *not* possible to write performance-critical inner-loop code without dropping down to a lower-level language (C, Fortran, Cython...). If you are coding exclusively in Python or R, and there isn't an optimized function appropriate for the innermost loops of your task at hand, you are out of luck.

Like I said: I like Julia and I am rooting for it, but just to play devil's advocate: I believe it's also a goal (& possibility) of numba to write C-level efficient code in Python. All you have to do is add an annotation here and there.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I believe that both are actually very similar in that manner. I think the main difference comes from the fact that Julia is an attempt to design the core library to support and use the efficient constructs, while Numba and other related projects are, for better or worse, inheriting the default python semantics and built-in libraries. Sometimes a new language is better than an old language simply because it can drop compatibility concerns. For example, Java is known for providing far more consistent multi-threading support than C, since it is a language construct and not an add-on feature. It was possible in both; one just made it easier for the programmer to access. Similarly, Node made it feasible to write programs without any concept of a blocking operation. Again, this was already possible in languages like Python and C, but Node (with its legacy in JavaScript) made it a feature of the language and designed all of the core APIs to deal with it.

On Fri, May 1, 2015 at 2:27 PM Steven G. Johnson stevenj@gmail.com wrote: On Friday, May 1, 2015 at 2:04:44 PM UTC-4, Steven Sagaert wrote: like I said: I like Julia and I am rooting for it, but just to play devil's advocate: I believe it's also a goal (& possibility) of numba to write C-level efficient code in Python. All you have to do is add an annotation here and there. Numba is arguably a 2nd lower-level language that happens to be embedded in Python — it is telling that Numba's documentation explicitly states that it can only get good performance when it is able to JIT the inner loops in nopython mode — basically, code that doesn't stray outside a small set of types.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 2:04:44 PM UTC-4, Steven Sagaert wrote: like I said: I like Julia and I am rooting for it but just to play devil's advocate: I believe it's also a goal ( possibility) of numba to write c-level efficient code in Python. All you have to do add an annotation here and there. Numba is arguably a 2nd lower-level language that happens to be embedded in Python — it is telling that Numba's documentation explicitly states that it can only get good performance when it is able to JIT the inner loops in nopython mode — basically, code that doesn't stray outside a small set of types.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 1:25:41 AM UTC-4, Jeff Bezanson wrote: It is true that we have not yet done enough to optimize the worst and worse performance cases. The bright side of that is that we have room to improve; it's not that we've run out of ideas and techniques. Tim is right that the complexity of our dispatch system makes julia potentially slower than python. But in dispatch-heavy code I've seen cases where we are faster or slower; it depends. Python's string and dictionary operations, in particular, are really fast. This is not surprising considering what the language was designed for, and that they have a big library of well-tuned C code for these things. I still maintain that it is misleading to describe an *asymptotic* slowdown as 800x slower. If you name a constant factor, it sounds like you're talking about a constant-factor slowdown. But the number is arbitrary, because it depends on data size. In theory, of course, an asymptotic slowdown is *much worse* than a constant-factor slowdown. However, in the systems world constant factors are often more important, and are often what we talk about.

No, that was just my very first test comparing Julia and Python, using a size that matched the record sizes I'd typically seen from way too many years of benchmarking (database / string processing operations).

You say a lot of the algorithms are O(n) instead of O(1). Are there any examples other than length()? Actually, it's worse than that... length, finding a particular character by character position, and getting a substring by character position - some of the most frequent operations for what I deal with - are O(n) instead of O(1), and things like conversions are O(n^2), not O(n) [and the conversions are much more complex, due to the string representation in Julia, unlike Python 3]. The conversions I am fixing, so that they are not O(n^2), but rather O(n) [slower than Python, again because of the representation, but not asymptotically].
The reason they are O(n^2), like the string concatenation problem I ran into right when I first started to evaluate Julia, is the way the conversion functions are written: initially creating a 0-length array, then doing push! to successively add characters to the array, and then finally calling UTF8String, UTF16String, or UTF32String to convert the Vector{UInt8}, Vector{UInt16}, or Vector{Char}, respectively, into an immutable string. As the string grows, Julia's internals end up having to reallocate the memory and sometimes copy it to a new location, hence the O(n^2) nature of the code. My changes, which hopefully will be accepted (after I check in my next round of pure Julia optimizations), solve that by first validating the input UTF-8, UTF-16, or UTF-32 string at the same time as calculating how many characters of the different ranges are present, so that the memory can be allocated once, at exactly the size needed, and also frequently allowing dispatch to simpler conversion code, when it is known that all of the characters in the string just need to be widened (zero-extended) or narrowed.

I disagree that UTF-8 has no space savings over UTF-32 when using the full range of unicode. The reason is that strings often have only a small percentage of non-BMP characters, with lots of spaces and newlines etc. You don't want your whole file to use 4x the space just to use one emoji. Please read my statement more carefully... UTF-8 *can* take up to 50% more storage than UTF-16 if you are just dealing with BMP characters. If you have some field that needs to hold *a certain number of Unicode characters*, for the full range of Unicode, you need to allocate 4 bytes for every character, so no savings compared to UTF-16 or UTF-32. My point was that if you have to allocate a buffer to hold a certain # of characters, say because you have a CHAR, NCHAR, WCHAR, or VARCHAR, etc.
field from a DBMS, then for UTF-8 you need to allocate at least 4 bytes per character, so no savings over UTF-16 or UTF-32 for those operations... I spent over two years going back and forth to Japan when I designed (and was the main implementor of) the Unicode support for a database system / language, and spent a lot of time looking at just how much storage space different representations would take... Note, at that time, Unicode 2.0 was not out, so the choice was between UCS-2 (no surrogates then), UTF-8, some combination thereof, or some new encoding. My first version, released finally in 1997, used either 8-bit (ANSI Latin-1) or UCS-2 to store data... The next release, I came up with a new encoding for Unicode that was much more compact (at the insistence of the Japanese customers, who didn't want their storage requirements to increase because of moving from SJIS and EUC to Unicode). In memory, all strings were UCS-2 (or really UTF-16, but like Java, because I designed it
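For readers wondering why length and character indexing are O(n) for a UTF-8 string: UTF-8 is variable-width, so finding the n-th character requires scanning from the start. A sketch of that scan over raw UTF-8 bytes (Python, illustrative only):

```python
def nth_char_offset(data: bytes, n: int) -> int:
    # Byte offset of the n-th (0-based) character in UTF-8 data.
    # UTF-8 continuation bytes match the bit pattern 0b10xxxxxx, so each
    # character start must be found by scanning past them: O(n) per lookup.
    off = 0
    for _ in range(n):
        off += 1
        while off < len(data) and (data[off] & 0xC0) == 0x80:
            off += 1
    return off

# 'a' (1 byte), '€' (3 bytes), '😀' (4 bytes): 3 characters, 8 bytes.
b = "a\u20ac\U0001F600".encode("utf-8")
print(nth_char_offset(b, 2))  # → 4 (the emoji starts at byte 4)
```

This also illustrates Scott's storage point: a field that must hold k arbitrary Unicode characters needs 4k bytes in UTF-8 in the worst case, the same as UTF-32, even though typical text is far smaller.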
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and Julia. I suspect the performance will be very close then (and close to C performance). Similarly, the standard benchmark (on the opening page of the julia website) between R and Julia is also flawed, because it takes the best-case scenario for julia (loops & mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beats julia; see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/. So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that) but because it's a clean, interesting language (still needs work for some rough edges of course) with clean(er), clear(er) libraries, and it gives reasonable performance out of the box without much tweaking.

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott

On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code.. (assuming same algorithm; note the row- vs. column-major caveat). The main point of mine: *should* Python at any time win?

2015-04-30 21:36 GMT+00:00 Sisyphuss zhengw...@gmail.com: This post interests me. I'll write something here to follow this post.
The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example `(x**2)**2` is 100x faster than `x**4`.

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As a best language is subjective, I'll put that aside for a moment.] Part I. The goal, as I understand, for Julia is at least within a factor of two of C and already matching it mostly and long term beating that (and C++). [What other goals are there? How about 0.4 now or even 1.0..?] While that is the goal as a language, you can write slow code in any language and Julia makes that easier. :) [If I recall, Bezanson mentioned it (the global problem) as a feature, any change there?] I've been following this forum for months and newbies hit the same issues. But almost always without fail, Julia can be sped up (easily, as Tim Holy says). I'm thinking about the exceptions to that - are there any left? And about the first-code slowness (see Part II). Just recently the last two flaws of Julia that I could see were fixed: Decimal floating point is in (I'll look into the 100x slowness, that is probably to be expected of any language; still, I think it may be a misunderstanding and/or I can do much better). And I understand the tuple slowness has been fixed (that was really the only core language defect). The former wasn't a performance problem (mostly a non-existence problem and a correctness one (where needed)..). Still we see threads like this recent one: https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw It seems changing the order of nested loops also helps. Obviously Julia can't beat assembly, but really C/Fortran is already close enough (within a small factor). The above row- vs. column-major issue (caching effects in general) can kill performance in all languages.
Putting that newbie mistake aside, is there any reason Julia can't be within a small factor of assembly (or C) in all cases already?

Part II. Except for caching issues, I still want the most newbie code or intentionally brain-damaged code to run faster than at least Python/scripting/interpreted languages. Potential problems (that I think are solved or at least not problems in theory):

1. I know Any kills performance. Still, isn't that the default in Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at least all the so-called scripting languages in all cases (excluding small startup overhead, see below)?

2. The global issue - not sure if that slows other languages down, say Python. Even if it doesn't, should Julia be slower than Python because of global?

3. Garbage collection. I do not see that as a problem, incorrect? Mostly performance variability ([3D] games - subject for another post, as I'm not sure
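Sisyphuss's `(x**2)**2` vs `x**4` observation is easy to measure rather than take on faith. A measurement sketch (the 100x figure is the poster's claim; actual ratios vary by CPython version and whether x is an int or a float):

```python
import timeit

x = 3.14159
# Time each expression; any gap typically comes from how the interpreter
# handles small-integer exponents, not from anything the benchmark
# author controls.
t_pow4 = timeit.timeit("x ** 4", globals={"x": x}, number=200_000)
t_sqsq = timeit.timeit("(x ** 2) ** 2", globals={"x": x}, number=200_000)
print(f"x**4: {t_pow4:.4f}s  (x**2)**2: {t_sqsq:.4f}s")

# Whatever the timings, the float results agree to rounding error:
assert abs((x ** 2) ** 2 - x ** 4) < 1e-9
```

Interpreter-specific effects like this are exactly what makes cross-language microbenchmarks so tricky to interpret.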
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 4:25:50 AM UTC-4, Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. They seem to be between standard Python and Julia, but since Julia is all about scientific programming it really should be between SciPy and Julia. Since SciPy uses much of the same underlying libs in Fortran/C, the performance gap will be much smaller, and to be really fair it should be between numba-compiled SciPy code and Julia. I suspect the performance will be very close then (and close to C performance).

Why should Julia be limited to scientific programming? I think it can be a great language for general programming; for the most part, I think it already is (it can use some changes for string handling [I'd like to work on that ;-)], decimal floating point support [that is currently being addressed, kudos to Steven G. Johnson], maybe some better language constructs to allow better software engineering practices [that is being hotly debated!], and definitely a real debugger [I think Keno is working on that]). Comparing Julia to Python for general computing is totally valid and interesting. Comparing Julia to SciPy for scientific computing is also totally valid and interesting.

Similarly, the standard benchmark (on the opening page of the julia website) between R and Julia is also flawed, because it takes the best-case scenario for julia (loops & mutable data structures) and the worst-case scenario for R. When the same R program is rewritten in vectorised style it beats julia; see https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/ . So my interest in julia isn't because it is the fastest scientific high-level language (because clearly at this stage you can't really claim that) but because it's a clean, interesting language (still needs work for some rough edges of course) with clean(er), clear(er) libraries, and it gives reasonable performance out of the box without much tweaking.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Don't apologize; instead, tell us more about what Go does, and how you think things can be better. Those of us who don't know Go will thank you for it. Best, --Tim

On Thursday, April 30, 2015 09:42:47 PM Harry B wrote: Sorry, my comment wasn't well thought out and a bit off topic. On exceptions/errors, my issue is this: https://github.com/JuliaLang/julia/issues/7026 On profiling, I was comparing to Go, but again off topic and I take my comment back. I don't have any intelligent remarks to add (yet!) :) Thank you for all the work you are doing.

On Thursday, April 30, 2015 at 7:00:01 PM UTC-7, Tim Holy wrote: Harry, I'm curious about 2 of your 3 last points: On Thursday, April 30, 2015 05:50:15 PM Harry B wrote: (exceptions?, debugging, profiling tools) We have exceptions. What aspect are you referring to? Debugger: yes, that's missing, and it's a huge gap. Profiling tools: in my view we're doing OK (better than Matlab, in my opinion), but what do you see as missing? --Tim

Thanks. It seemed to me tuples were slow because Any is used. I understand tuples have been fixed; I'm not sure how. I do not remember the post/all the details. Yes, tuples were slower than Python. Maybe it was Dict - isn't that kind of a tuple? Now we have Pair in 0.4. I do not have 0.4; maybe I should bite the bullet and install.. I'm not doing anything production related, just trying things out, and using 0.3[.5] to avoid stability problems.. Then I can't judge the speed.. Another potential issue I saw with tuples (maybe that is not a problem in general, and I do not know what languages do this) is that they can take a lot of memory (to copy around). I was thinking, maybe they should do similar to databases: only use a fixed amount of memory (a page) with a pointer to overflow data..

2015-04-30 22:13 GMT+00:00 Ali Rezaee arv@gmail.com: They were interesting questions.
I would also like to know why poorly written Julia code sometimes performs worse than similar Python code, especially when tuples are involved. Did you say it was fixed?
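One reason "similar" Python code can hold its own against naive Julia is that idiomatic Python pushes the hot loop into tuned C built-ins. A minimal sketch (CPython; timings are illustrative, not a claim about any particular ratio) comparing a hand-written accumulation loop against the built-in `sum`:

```python
import timeit

def hand_sum(xs):
    # explicit Python-level loop: every iteration goes through the interpreter
    total = 0
    for x in xs:
        total += x
    return total

xs = list(range(100_000))
t_loop = min(timeit.repeat(lambda: hand_sum(xs), number=20, repeat=5))
t_builtin = min(timeit.repeat(lambda: sum(xs), number=20, repeat=5))
print(f"hand-written loop: {t_loop:.4f}s, built-in sum: {t_builtin:.4f}s")
```

On a typical CPython the built-in wins by a large factor, which is why "newbie Julia vs. tuned Python" comparisons are often really interpreted-loop vs. C-library comparisons.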
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I just read through all of that very interesting thread on exceptions... it seems that Stefan was trying to reinvent the wheel without knowing it. Everybody interested in exception handling should go look up CLU... Julia seems to have gotten a lot of ideas from CLU (possibly rather indirectly, through C++, Java, Lua...). CLU had this well handled 40 years ago ;-) Scott
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 12:26:54 PM UTC+2, Scott Jones wrote: Why should Julia be limited to scientific programming? I think it can be a great language for general programming.

I agree, but for now and the near future I think the core domain of Julia is scientific computing/data science, and so to have fair comparisons one should not just compare Julia to vanilla Python but especially to SciPy and Numba.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 1, 2015 at 3:25:50 AM UTC-5, Steven Sagaert wrote: I think the performance comparisons between Julia and Python are flawed. All benchmarks are flawed in that sense--a single benchmark can't tell you everything. The Julia performance benchmarks test algorithms expressed in the languages themselves; they are not a test of foreign-function interfaces and BLAS implementations, so the benchmarks don't test that. This has been discussed at length--as one example, see https://github.com/JuliaLang/julia/issues/2412.
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
Hi Steven, I understand your point---you're saying you'd be unlikely to write those algorithms in that manner if your goal were to do those particular computations. But the important point to keep in mind is that those benchmarks are simply toys for the purpose of testing the performance of various language constructs. If you think it's irrelevant to benchmark loops for scientific code, then you do very, very different stuff than me. Not all algorithms reduce to BLAS calls. I use Julia to write all kinds of algorithms that I used to write MEX functions for, back in my Matlab days. If all you need is A*b, then of course basically any scientific language will be just fine, with minimal differences in performance.

Moreover, that R benchmark on cumsum is simply not credible. I'm not sure what was happening (and that article doesn't post its code or the procedures used to test), but Julia's cumsum reduces to efficient machine code (basically, a bunch of addition operations). If they were computing cumsum across a specific dimension, then this PR changed things: https://github.com/JuliaLang/julia/pull/7359 But more likely, someone forgot to run the code twice (so it got JIT-compiled), had a type instability in the code they were testing, or made some other mistake. It's too bad one can make mistakes, of course, but then it becomes a comparison of different programmers rather than different programming languages. Indeed, if you read the comments on that post, Stefan already rebutted that benchmark, with a 4x advantage for Julia: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/comment-page-1/#comment-89 --Tim

On Friday, May 1, 2015 at 12:10:58 AM UTC+2, Scott Jones wrote: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott
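Tim's "run the code twice" caveat generalizes to a simple measurement rule: benchmark a warmed-up function and report the best of several repeats, not the first run. A sketch in Python (the same hygiene applies to Julia, where the first call additionally includes JIT compilation):

```python
import timeit

def cumsum(xs):
    # naive cumulative sum, the operation benchmarked in the R blog post
    total, out = 0, []
    for x in xs:
        total += x
        out.append(total)
    return out

xs = list(range(10_000))
cumsum(xs)  # warm-up call (in a JIT language this would trigger compilation)
times = timeit.repeat(lambda: cumsum(xs), number=10, repeat=5)
print(f"best of 5: {min(times) / 10:.6f} s per call")  # report the minimum, not the first timing
```

Skipping the warm-up, or timing a single run, measures compilation and system noise as much as the code itself - exactly the kind of mistake Tim suspects in the cumsum comparison.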
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
On Friday, May 01, 2015 03:19:03 AM Scott Jones wrote: As the string grows, Julia's internals end up having to reallocate the memory and sometimes copy it to a new location, hence the O(n^2) nature of the code. Small correction: push! is not O(n^2), it's O(n log n). Internally, the storage array grows by factors of 2 [1]; after one allocation of size 2n you can add n more elements without reallocating. That said, O(n log n) can pretty easily be beaten by O(2n): make one pass through and count how many you'll need, allocate the whole thing, and then stuff in the elements. As you seem to be planning to do. --Tim [1] Last I looked, that is; there was some discussion about switching it to something like 1.5 because of various discussions of memory fragmentation and reuse.
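The geometric growth Tim describes is the same trick CPython's `list.append` uses (with an implementation-defined growth factor); you can watch the reallocations happen by checking when `sys.getsizeof` jumps. A small sketch:

```python
import sys

lst = []
reallocs = 0
last_size = sys.getsizeof(lst)
for i in range(100_000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:  # reported size changed => the backing buffer was reallocated
        reallocs += 1
        last_size = size
print(f"{len(lst)} appends caused {reallocs} reallocations")
```

Because capacity grows geometrically, only O(log n) reallocations occur across n appends; preallocating the full size up front (the "O(2n)" strategy Tim mentions) avoids even those.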
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
It is true that we have not yet done enough to optimize the worst and worse performance cases. The bright side of that is that we have room to improve; it's not that we've run out of ideas and techniques. Tim is right that the complexity of our dispatch system makes Julia potentially slower than Python. But in dispatch-heavy code I've seen cases where we are faster or slower; it depends. Python's string and dictionary operations, in particular, are really fast. This is not surprising considering what the language was designed for, and that they have a big library of well-tuned C code for these things. I still maintain that it is misleading to describe an *asymptotic* slowdown as 800x slower. If you name a constant factor, it sounds like you're talking about a constant-factor slowdown. But the number is arbitrary, because it depends on data size. In theory, of course, an asymptotic slowdown is *much worse* than a constant-factor slowdown. However, in the systems world constant factors are often more important, and are often what we talk about. You say a lot of the algorithms are O(n) instead of O(1). Are there any examples other than length()? I disagree that UTF-8 has no space savings over UTF-32 when using the full range of Unicode. The reason is that strings often have only a small percentage of non-BMP characters, with lots of spaces and newlines etc. You don't want your whole file to use 4x the space just to use one emoji.
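Jeff's space argument is easy to check: encode a mostly-ASCII string containing a single non-BMP character both ways and compare byte counts. A minimal sketch:

```python
# mostly-ASCII text with a single non-BMP character (an emoji) appended
s = ("The quick brown fox jumps over the lazy dog.\n" * 20) + "\U0001F389"
utf8_bytes = len(s.encode("utf-8"))       # 1 byte per ASCII char, 4 for the emoji
utf32_bytes = len(s.encode("utf-32-le"))  # 4 bytes per char, unconditionally
print(f"chars={len(s)} utf8={utf8_bytes} utf32={utf32_bytes}")
```

UTF-32 always costs 4 bytes per character, so the one emoji inflates the whole string roughly 4x relative to UTF-8 - exactly the "whole file uses 4x the space just to use one emoji" point.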
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code (assuming the same algorithm; note the row- vs. column-major caveat). The main point of mine: *should* Python at any time win?

2015-04-30 21:36 GMT+00:00 Sisyphuss zhengwend...@gmail.com: This post interests me. I'll write something here to follow this post. The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example, `(x**2)**2` is 100x faster than `x**4`.

On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As "best language" is subjective, I'll put that aside for a moment.]

Part I. The goal for Julia, as I understand it, is to be at least within a factor of two of C, already matching it mostly, and long term beating it (and C++). [What other goals are there? How about in 0.4 now, or even 1.0..?] While that is the goal as a language, you can write slow code in any language, and Julia makes that easier. :) [If I recall, Bezanson mentioned it (the global problem) as a feature; any change there?] I've been following this forum for months and newbies hit the same issues. But almost always, without fail, Julia can be sped up (easily, as Tim Holy says). I'm thinking about the exceptions to that - are there any left? And about the first-code slowness (see Part II). Just recently the last two flaws of Julia that I could see were fixed: decimal floating point is in (I'll look into the 100x slowness; that is probably to be expected of any language; still, I think it may be a misunderstanding and/or I can do much better), and I understand the tuple slowness has been fixed (that was really the only core-language defect). The former wasn't a performance problem (mostly a non-existence problem, and a correctness one (where needed)..). Still we see threads like this recent one: https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw It seems changing the order of nested loops also helps. Obviously Julia can't beat assembly, but really C/Fortran is already close enough (within a small factor). The above row- vs. column-major issue (caching effects in general) can kill performance in all languages. Putting that newbie mistake aside, is there any reason Julia can't already be within a small factor of assembly (or C) in all cases?

Part II. Except for caching issues, I still want the most newbie code, or intentionally brain-damaged code, to run faster than at least Python/scripting/interpreted languages. Potential problems (that I think are solved, or at least not problems in theory):

1. I know Any kills performance. Still, isn't that the default in Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at least all the so-called scripting languages in all cases (excluding small startup overhead, see below)?

2. The global issue. Not sure if that slows other languages down, say Python. Even if it doesn't, should Julia be slower than Python because of global?

3. Garbage collection. I do not see that as a problem; incorrect? Mostly performance variability ([3D] games - subject for another post, as I'm not sure that is even a problem in theory..). Should reference counting (Python) be faster? On the contrary, I think RC and even manual memory management could be slower.

4. Concurrency, see nr. 3. GC may or may not have an issue with it. It can be a problem; what about in Julia? There are concurrent and/or real-time GC algorithms (just not in Julia). Other than GC, is there any big (potential) problem for concurrent/parallel? I know about the threads work and the new GC in 0.4.

5. Subarrays (array slicing?). Not really what I consider a problem, compared to say C (and Python?). I know 0.4 did optimize it, but what languages do similar stuff? Functional ones?

6. In theory, pure functional languages should be faster. Are they in practice in many or any case? Julia has non-mutable state if needed, but maybe not as powerful? This seems a double-edged sword. I think the Julia designers intentionally chose mutable state to conserve memory. Pros and cons? Mostly pros for Julia?

7. Startup time. Python is faster, and for say web use, or compared to PHP, that could be an issue, but it would be solved by not doing CGI-style web. How good/fast is Julia/the libraries right now for say web use? At least for long-running programs (the intended target of Julia) startup time is not an issue.

8. MPI. I do not know enough about it, or parallel in general; it seems you are doing a good job. I at least think there is no inherent limitation. At least Python is not in any way better for parallel/concurrent?

9. Autoparallel. Julia doesn't try to be, but could (be an addon?). Is anyone doing really good and
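Sisyphuss's claim above that `(x**2)**2` is 100x faster than `x**4` is easy to measure rather than take on faith. A hedged micro-benchmark sketch (CPython; the result will vary by interpreter and version, so no outcome is claimed here - only that the two expressions agree in value for integers):

```python
import timeit

x = 7
assert (x ** 2) ** 2 == x ** 4  # identical results for integer x

t_nested = min(timeit.repeat("(x ** 2) ** 2", globals={"x": x}, number=100_000, repeat=5))
t_direct = min(timeit.repeat("x ** 4", globals={"x": x}, number=100_000, repeat=5))
print(f"(x**2)**2: {t_nested:.4f}s  x**4: {t_direct:.4f}s")
```

Whatever the measured ratio turns out to be, it is an implementation detail of the interpreter's integer-power path, not a language-level guarantee - which is the broader caution this thread keeps returning to about micro-benchmarks.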
Re: [julia-users] Re: Performance variability - can we expect Julia to be the fastest (best) language?
It seemed to me tuples where slow because of Any used. I understand tuples have been fixed, I'm not sure how. I do not remember the post/all the details. Yes, tuples where slow/er than Python. Maybe it was Dict, isn't that kind of a tuple? Now we have Pair in 0.4. I do not have 0.4, maybe I should bite the bullet and install.. I'm not doing anything production related and trying things out and using 0.3[.5] to avoid stability problems.. Then I can't judge the speed.. Another potential issue I saw with tuples (maybe that is not a problem in general, and I do not know that languages do this) is that they can take a lot of memory (to copy around). I was thinking, maybe they should do similar to databases, only use a fixed amount of memory (a page) with a pointer to overflow data.. 2015-04-30 22:13 GMT+00:00 Ali Rezaee arv.ka...@gmail.com: They were interesting questions. I would also like to know why poorly written Julia code sometimes performs worse than similar python code, especially when tuples are involved. Did you say it was fixed? On Thursday, April 30, 2015 at 9:58:35 PM UTC+2, Páll Haraldsson wrote: Hi, [As a best language is subjective, I'll put that aside for a moment.] Part I. The goal, as I understand, for Julia is at least within a factor of two of C and already matching it mostly and long term beating that (and C++). [What other goals are there? How about 0.4 now or even 1.0..?] While that is the goal as a language, you can write slow code in any language and Julia makes that easier. :) [If I recall, Bezanson mentioned it (the global problem) as a feature, any change there?] I've been following this forum for months and newbies hit the same issues. But almost always without fail, Julia can be speed up (easily as Tim Holy says). I'm thinking about the exceptions to that - are there any left? And about the first code slowness (see Part II). 
Just recently the last two flaws of Julia that I could see were fixed: decimal floating point is in (I'll look into the 100x slowness; that is probably to be expected of any language, and still I think it may be a misunderstanding and/or I can do much better). And I understand the tuple slowness has been fixed (that was really the only core-language defect). The former wasn't a performance problem (mostly a non-existence problem, and a correctness one (where needed)..). Still we see threads like this recent one: https://groups.google.com/forum/#!topic/julia-users/-bx9xIfsHHw It seems changing the order of nested loops also helps. Obviously Julia can't beat assembly, but really C/Fortran is already close enough (within a small factor). The above row- vs. column-major issue (caching effects in general) can kill performance in all languages. Putting that newbie mistake aside, is there any reason Julia can't be within a small factor of assembly (or C) in all cases already?

Part II. Except for caching issues, I still want the most newbie code, or intentionally brain-damaged code, to run faster than at least Python/scripting/interpreted languages. Potential problems (that I think are solved, or at least not problems in theory):

1. I know Any kills performance. Still, isn't that the default in Python (and Ruby, Perl?)? Is there a good reason Julia can't be faster than at least all the so-called scripting languages in all cases (excluding small startup overhead, see below)?

2. The global issue: not sure if that slows other languages down, say Python. Even if it doesn't, should Julia be slower than Python because of global?

3. Garbage collection. I do not see that as a problem; incorrect? Mostly performance variability ([3D] games - subject for another post, as I'm not sure that is even a problem in theory..). Should reference counting (Python) be faster? On the contrary, I think RC and even manual memory management could be slower.

4. Concurrency, see nr. 3.
GC may or may not have an issue with it. It can be a problem; what about in Julia? There are concurrent and/or real-time GC algorithms (just not in Julia). Other than GC, is there any big (potential) problem for concurrent/parallel? I know about the threads work and the new GC in 0.4. 5. Subarrays (array slicing?). Not really what I consider a problem, compared to say C (and Python?). I know 0.4 did optimize it, but what languages do similar stuff? Functional ones? 6. In theory, pure functional languages should be faster. Are they in practice, in many or any cases? Julia has immutable state if needed, but maybe not as powerful? This seems a double-edged sword. I think the Julia designers intentionally chose mutable state to conserve memory. Pros and cons? Mostly pros for Julia? 7. Startup time. Python is faster, and for say web use, or compared to PHP, this could be an issue, but it would be solved by not doing CGI-style web. How good/fast are Julia/the libraries right now for say web use? At least for long-running programs (the intended target of Julia)
On Thursday, April 30, 2015 at 6:34:23 PM UTC-4, Páll Haraldsson wrote: Interesting.. does that mean it is Unicode then that is esp. faster, or something else? 800x faster is way worse than I thought, and no good reason for it.. That particular case is because CPython (the standard C implementation of Python, what most people mean when they say Python) has optimized the case of var += string, i.e. appending to a variable. Although strings *are* immutable in Python, as in Julia, Python detects that you are replacing a string with that string concatenated with another, and if nobody else has a reference to the string in that variable, it can simply update the string in place; otherwise, it makes a new string big enough for the result and sets the variable to that new string. I'm really intrigued by what it is that is this slow; it can't be the simple things like say just string concatenation?! You can get similar speed using PyCall.jl :) I'm not so sure... I don't really think so - because you still have to move the string from Julia (which uses either ASCII or UTF-8 for strings by default; you have to specifically convert them to get UTF-16 or UTF-32...) to Python, and then back... and Julia's string conversions are rather slow... O(n^2) in most cases... (I'm working on improving that; I hope I can get my changes accepted into Julia's Base.) For some obscure function like Levenshtein distance I might expect this (or it's not implemented yet in Julia), as Python would use tuned C code, or in any function where you need to do non-trivial work per function-call. I failed to add regex to the list as an example, as I was pretty sure it was as fast (or faster, because of macros) as Perl, as it is using the same library. Similarly for all Unicode/UTF-8 stuff I was not expecting slowness. I know the work on that in Python 2/3 and expected Julia could/did similar. No, a lot of the algorithms are O(n) instead of O(1), because of the decision to use UTF-8...
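For reference, a minimal Python sketch of the var += string pattern described above. The in-place resize itself happens inside CPython and is not visible from Python code; the sketch only shows the two idioms and checks that they agree:

```python
# The append pattern CPython special-cases: when `s` is the only
# reference, repeated += can often resize the string in place,
# making the loop roughly O(n) overall instead of O(n^2).
def build_with_concat(parts):
    s = ""
    for p in parts:
        s += p            # candidate for CPython's in-place optimization
    return s

def build_with_join(parts):
    # The portable idiom that is O(n) without relying on that optimization.
    return "".join(parts)

parts = [str(i) for i in range(1000)]
assert build_with_concat(parts) == build_with_join(parts)
```

The optimization is refcount-based, which is why it has no direct analogue under a tracing GC like Julia's.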
I'd like to convince the core team to change Julia to do what Python 3 does. UTF-8 is pretty bad to use as an internal string representation (where it shines is as an interchange format). UTF-8 can take up to 50% more storage than UTF-16 if you are just dealing with BMP characters. If you have some field that needs to hold a certain number of Unicode characters, for the full range of Unicode, you need to allocate 4 bytes for every character, so no savings compared to UTF-16 or UTF-32. Python 3 internally stores strings as either: 7-bit (ASCII), 8-bit (ANSI Latin-1, only characters below 0x100 present), 16-bit (UCS-2, i.e. there are no non-BMP characters present), or 32-bit (UTF-32). You might wonder why there is a special distinction between 7-bit ASCII and 8-bit ANSI Latin-1... they are both Unicode subsets, but 7-bit ASCII can also be used directly, without conversion, as UTF-8. All internal formats are directly addressable (unlike Julia's UTF8String and UTF16String), and the conversions between the 4 internal types are very fast: simple widening (or a no-op, as in the case of ASCII to Latin-1) when going from smaller to larger. Julia also has a big problem with always wanting to have a terminating \0 byte or word, which means that you can't take a substring or slice of another string without making a copy to be able to add that terminating \0 (so lots of extra memory allocation and garbage collection for common algorithms). I hope that makes things a bit clearer! Scott
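A sketch of the Python 3 (PEP 393) choice Scott describes: pick the narrowest representation that can hold every code point in the string. The function name is made up for illustration; CPython does this in C at string creation time:

```python
# Hypothetical sketch of the PEP 393 decision: classify a string by the
# narrowest internal storage that fits its largest code point.
def storage_kind(s: str) -> str:
    max_cp = max(map(ord, s), default=0)
    if max_cp < 0x80:
        return "ascii"    # 1 byte/char, and already valid UTF-8 as-is
    if max_cp < 0x100:
        return "latin-1"  # 1 byte/char, but needs conversion for UTF-8
    if max_cp < 0x10000:
        return "ucs-2"    # 2 bytes/char, BMP only, no surrogates needed
    return "ucs-4"        # 4 bytes/char, full Unicode range

assert storage_kind("hello") == "ascii"
assert storage_kind("h\u00e9llo") == "latin-1"   # é is U+00E9
assert storage_kind("h\u20ac") == "ucs-2"        # € is U+20AC, in the BMP
assert storage_kind("h\U0001F600") == "ucs-4"    # emoji, outside the BMP
```

Every kind is directly indexable by character position, which is the property Scott contrasts with UTF-8's O(n) indexing.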
Harry, I'm curious about 2 of your 3 last points: On Thursday, April 30, 2015 05:50:15 PM Harry B wrote: (exceptions?, debugging, profiling tools) We have exceptions. What aspect are you referring to? Debugger: yes, that's missing, and it's a huge gap. Profiling tools: in my view we're doing OK (better than Matlab, in my opinion), but what do you see as missing? --Tim
a newbie comment: If it can be made a bit easier to write code that uses all the cores (I am comparing to Go with its channels), it probably doesn't need to be faster than Python. From an outsider's perspective, @everywhere is inconvenient. pmap etc. don't cover nearly as many cases as Go channels. Maybe it is a documentation problem. I wouldn't think it would be good to try to extract every last bit of speed when you are at 0.4.. there are so many things to clean up/build in the language and standard library (exceptions?, debugging, profiling tools). Thanks -- Harry
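Harry's Go-channels comparison, sketched in Python terms since that is the thread's baseline: a queue plays the role of a channel, and an executor map plays the role of pmap. Worker counts and names are arbitrary:

```python
# pmap-style and channel-style parallelism, sketched with the stdlib.
from concurrent.futures import ThreadPoolExecutor
import queue

def work(x):
    return x * x

# pmap-like: parallel map over the inputs, results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(8)))
assert results == [x * x for x in range(8)]

# channel-like: workers push results into a queue, consumer drains it.
chan = queue.Queue()
with ThreadPoolExecutor(max_workers=4) as pool:
    for x in range(8):
        pool.submit(lambda v=x: chan.put(work(v)))
drained = sorted(chan.get() for _ in range(8))
assert drained == sorted(x * x for x in range(8))
```

The channel version gives up ordering but decouples producers from the consumer, which is the flexibility Harry is pointing at.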
Strings have long been a performance sore-spot in julia, so we're glad Scott is hammering on that topic. For interpreted code (including Julia with Any types), it's very possible that Python is and will remain faster. For one thing, Python is single-dispatch, which means that when the interpreter has to go look up the function corresponding to your next expression, typically the list of applicable methods is quite short. In contrast, julia sometimes has to sort through huge method tables to determine the appropriate one to dispatch to. Multiple dispatch adds a lot of power to the language, and there's no performance cost for code that has been compiled, but it does make interpreted code slower. Best, --Tim
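A toy illustration of Tim's point (this is not Julia's actual method-table algorithm): with multiple dispatch the lookup key is the whole tuple of argument types, so the table an interpreter must consult can grow much larger than a single-dispatch table keyed on the first argument alone:

```python
# Toy dispatch tables: single dispatch keys on the first argument's type,
# multiple dispatch keys on the tuple of all argument types.
single = {}   # type -> function
multi = {}    # (type, type, ...) -> function

def call_single(x, *rest):
    return single[type(x)](x, *rest)

def call_multi(*args):
    return multi[tuple(type(a) for a in args)](*args)

single[int] = lambda x, y: x + y
multi[(int, int)] = lambda x, y: x + y
multi[(int, float)] = lambda x, y: x - y   # a second method for the same name

assert call_single(2, 3) == 5
assert call_multi(2, 3) == 5
assert call_multi(2, 3.0) == -1.0
```

A real multiple-dispatch lookup also has to handle subtyping and ambiguity, which is exactly the extra work Tim says compiled code avoids but interpreted code pays.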
On Apr 30, 2015, at 9:58 PM, Tim Holy tim.h...@gmail.com wrote: Strings have long been a performance sore-spot in julia, so we're glad Scott is hammering on that topic. Thanks, Tim! I was beginning to think I’d be banned from all Julia forums, for being a thorn in the side of the Julia developers… (I do want to say again… if I didn’t think what all of you had created was incredibly great, I wouldn’t be so interested in making it even greater, in the particular areas I know a little about… Also, the issues I’ve found are not because the developers aren’t brilliant [I’ve been super impressed, and I don’t impress that easily!], but rather, either it’s outside of their area of expertise [as the numerical computing stuff is outside mine], or they are incredibly busy making great strides in the areas that they are more interested in…) On the point that Python is single-dispatch while julia sometimes has to sort through huge method tables, and that multiple dispatch has no performance cost for compiled code but makes interpreted code slower: Good point… Scott
Interesting.. does that mean it is Unicode then that is esp. faster, or something else? 800x faster is way worse than I thought, and no good reason for it.. I'm really intrigued by what it is that is this slow; it can't be the simple things like say just string concatenation?! You can get similar speed using PyCall.jl :) For some obscure function like Levenshtein distance I might expect this (or it's not implemented yet in Julia), as Python would use tuned C code, or in any function where you need to do non-trivial work per function-call. I failed to add regex to the list as an example, as I was pretty sure it was as fast (or faster, because of macros) as Perl, as it is using the same library. Similarly for all Unicode/UTF-8 stuff I was not expecting slowness. I know the work on that in Python 2/3 and expected Julia could/did similar. 2015-04-30 22:10 GMT+00:00 Scott Jones scott.paul.jo...@gmail.com: Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott
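Since Levenshtein distance keeps coming up as the example workload, here is a minimal pure-Python version, the kind of per-character inner loop that tuned C code easily beats:

```python
# Levenshtein (edit) distance via the standard dynamic program,
# keeping only one row of the table: O(len(a) * len(b)) time.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))   # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

assert levenshtein("kitten", "sitting") == 3
assert levenshtein("", "abc") == 3
assert levenshtein("abc", "abc") == 0
```

In either Julia or Python, a loop like this is exactly where interpreted per-call overhead shows up, and where Julia's compiled code should eventually shine.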
Yes... Python will win on string processing... esp. with Python 3... I quickly ran into things that were 800x faster in Python... (I hope to help change that though!) Scott On Thursday, April 30, 2015 at 6:01:45 PM UTC-4, Páll Haraldsson wrote: I wouldn't expect a difference in Julia for code like that (didn't check). But I guess what we are often seeing is someone comparing tuned Python code to newbie Julia code. I still want it faster than that code.. (assuming the same algorithm; note the row- vs. column-major caveat). The main point of mine: *should* Python at any time win? 2015-04-30 21:36 GMT+00:00 Sisyphuss zhengw...@gmail.com: This post interests me. I'll write something here to follow this post. The performance gap between normal code in Python and badly-written code in Julia is something I'd like to know too. As far as I know, the Python interpreter does some mysterious optimizations. For example `(x**2)**2` is 100x faster than `x**4`.
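On Sisyphuss's `(x**2)**2` vs `x**4` remark: the two are mathematically equal, and any speed difference is an interpreter detail best measured with `timeit` rather than taken at 100x on faith. A quick equivalence check:

```python
# (x**2)**2 computes x**4 via two squarings; whether it is faster
# depends on the interpreter and operand type. Timing left to `timeit`;
# here we only verify the rewrite is value-preserving.
import math

for x in [2, 3, 10, 123456789]:
    assert (x**2)**2 == x**4                  # exact for integers

for x in [2.0, 3.5, 1e10]:
    assert math.isclose((x**2)**2, x**4)      # up to rounding for floats
```

Repeated squaring is the general trick behind fast exponentiation, which is why an interpreter might apply it for small integer powers.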