Re: [julia-users] Re: ANN: NullableArrays.jl package

2015-10-18 Thread Sebastian Good
I’ve considered whether it makes sense to have, say, a NullableBits{T,V} where 
T is the bits type, e.g. Int64 or Float32, and V is the special value which is 
considered null. Then you would use a similar set of functions as defined in 
NullableArray to make operations on these numbers propagate the ‘special’ 
nullable values. No extra storage is needed to determine nullability, though of 
course there is the cost of checks against the value. Systems where V was 
unpredictable might result in excessive amounts of code being compiled for the 
various specialized types, e.g. NullableBits{Float32, -999.0f0} vs 
NullableBits{Float32, -9999.0f0}. But this is just speculation on my part; I was 
curious whether the group had looked at this representation in their explorations.
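For concreteness, a minimal sketch of the idea (the type and method names are hypothetical, in 0.4-style syntax; nothing like this exists in NullableArrays.jl):

    import Base: +, isnull

    # Hypothetical sketch: V is the sentinel value of bits type T that
    # stands in for null.
    immutable NullableBits{T,V}
        value::T
    end

    # A value is null iff it is bitwise-identical to the sentinel;
    # === compares bits, so even a NaN sentinel would work.
    isnull{T,V}(x::NullableBits{T,V}) = x.value === V

    # Operations propagate the sentinel rather than treating it as data.
    function +{T,V}(a::NullableBits{T,V}, b::NullableBits{T,V})
        (isnull(a) || isnull(b)) ? NullableBits{T,V}(V) :
                                   NullableBits{T,V}(a.value + b.value)
    end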
On October 18, 2015 at 3:54:23 PM, David Gold (david.gol...@gmail.com) wrote:

@Sebastian: If I understand you correctly, then it should at least be possible. 
If T is your datatype that behaves as such and x is the value of type T that 
designates missingness, then it seems you could straightforwardly write your 
own method to convert an Array{T} to a NullableArray{T} that sets a given entry 
in the target isnull field to true iff the corresponding entry in the argument 
array is x. I don't know how the pros and cons will play out for your specific 
use case.
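A sketch of that conversion (assuming the two-argument NullableArray(values, isnull) constructor; check the package for the exact signature):

    using NullableArrays

    # Mark entries equal to the sentinel x as null; map preserves the
    # shape of A and yields the Array{Bool} mask for the second argument.
    nullify{T}(A::Array{T}, x::T) = NullableArray(A, map(v -> v == x, A))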

On Saturday, October 17, 2015 at 7:38:24 AM UTC-7, Sebastian Good wrote:
This brings to mind a question that commonly comes up in scientific computing: 
nullable arrays simulated by means of a canonical "no-data value", e.g. in 
a domain where all values are expected to be negative, using a positive sentinel. It's ugly, 
but it's really common. From what I can see of the implementation, this is a 
different approach than is used in NullableArrays, where a lookaside table of 
null/not-null is kept. 

Is it sensible or possible to work with this kind of data with this package?

[julia-users] Re: ANN: NullableArrays.jl package

2015-10-17 Thread Sebastian Good
This brings to mind a question that commonly comes up in scientific 
computing: nullable arrays simulated by means of a canonical "no-data 
value", e.g. in a domain where all values are expected to be negative, using 
a positive sentinel. It's ugly, but it's really common. From what I can see of the 
implementation, this is a different approach than is used in 
NullableArrays, where a lookaside table of null/not-null is kept. 

Is it sensible or possible to work with this kind of data with this package?


[julia-users] Re: What features are interesting in a VS Code plug-in?

2015-10-17 Thread Sebastian Good
I would imagine most of the things on this list would apply readily to any 
IDE, but since you ask. 

1) One of the best IDE experiences out of MSFT was the F# IDE, where you 
could easily interrogate expression types by hovering. This helped 
immensely in that statically typed language. Now Julia isn't statically 
typed. But there are packages that help highlight potential issues, such as 
"type unstable" functions or obvious mismatches, such as passing a string 
to a function expecting a number. An IDE that made this information 
dynamically available would be powerfully helpful as you move from 
quick-and-dirty to locking-down-types for correctness or performance.

2) "Intellisense" is great because you type '.' and get a list of methods 
that could be useful to the value you have at hand. Julia of course doesn't 
work this way -- instead there are a variety of functions that might be 
useful. If you have an expression with a type provably more specific than 
Any it would be nice to show the output of methodswith for that type on 
some sort of shortcut. Unfortunately this sort of requires you to type 
backwards, but this might be possible with creative use of keyboard 
shortcuts and cursor movement.
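For instance, today's REPL equivalent of that lookup:

    julia> methodswith(Int32)   # lists methods with Int32 in their signatures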


Re: [julia-users] Re: The Julia Community Standards

2015-10-10 Thread Sebastian Good
Thanks Stefan.

On Saturday, October 10, 2015 at 12:32:05 PM UTC-5, Stefan Karpinski wrote:
>
> That was not an ad hominem attack, it was a request for you to stop 
> talking over everyone else on a subject about which you've already 
> demonstrated a considerable lack of awareness or insight. When you're 
> spouting a stream of nonsense here, you are effectively excluding everyone 
> else who might have something to say on the subject. This is not the Scott 
> P Jones show. I happen to know that we've nearly lost several valuable 
> community members because of your behavior, of which this thread is a 
> prime example. There are probably numerous others who have been driven 
> away. I've asked politely, and that didn't work, so now I'm afraid you've 
> forced my hand: I've configured your posts here and on julia-dev to be 
> moderated. They will only be allowed through if they are concise, 
> on-subject, and constructive. For the sake of greater inclusiveness, it 
> seems that you must be excluded, or at least somewhat muted.
>
> On Sat, Oct 10, 2015 at 8:54 PM, Scott Jones wrote:
>
>>
>>
>> On Saturday, October 10, 2015 at 7:53:58 AM UTC-4, Stefan Karpinski wrote:
>>>
>>> Anthropomorphization is fine, sexualization is not. The main reason that 
>>> using "she" to refer to Julia is not great is that the next thing is so 
>>> often to sexualize the term, not because there's anything objectionable 
>>> about anthropomorphizing Julia. For example, the Julia-tan anime 
>>> character is acceptable since it does not imply sexual activity.
>>>
>>
>> I thought that Anthropomorphization was not fine, the JCS states clearly 
>> "the programming language is not a person".
>> Julia-tan can represent a woman scientist/programmer, who happens to love 
>> the julia language, and is not *necessarily* an anthropomorphism.
>>  
>>
>>> That statement "the programming language is not a person and does not 
>>> have a gender" makes perfect sense in any language. While a word may have a 
>>> *grammatical* gender in a language, a programming language is not a 
>>> word, and does not. This basic distinction between a word and what it 
>>> refers to, is especially familiar to speakers of languages with grammatical 
>>> genders since there is often a mismatch between grammatical gender and 
>>> actual gender. For example, in German, "Mädchen" means "girl" but is a 
>>> neuter word, rather than feminine. Do you think that Germans are confused 
>>> about the actual gender of girls? To quote the wikipedia article about 
>>> grammatical 
>>> gender:
>>>
>>> In a few languages, the gender assignation of nouns is solely determined 
>>> by their meaning or attributes, like biological sex, humanness, animacy. 
>>> However, in most languages, this semantic division is only partially valid, 
>>> and many nouns may belong to a gender category that contrasts with their 
>>> meaning (e.g. the word for "manliness" could be of feminine gender). In 
>>> this case, the gender assignation can also be influenced by the morphology 
>>> or phonology of the noun, or in some cases can be apparently arbitrary.
>>>
>>>
>> That depends on how it is translated.  In Spanish, "género" by itself 
>> would generally mean grammatical gender, and you'd say "sexo", or possibly 
>> "género natural",
>> which is why the current phrasing might not really be all that clear to 
>> somebody whose first language is Spanish, for example.
>> I'm not saying that the point is wrong, just that it should be made 
>> clearer, as other people have already agreed.
>>
>> Anyway, I think we've already heard plenty from Scott P. Jones on this 
>>> subject. Please refrain from further commentary here, Scott – you've 
>>> already said more than your share and you are literally the single most 
>>> frequent violator of our community standards, having both made various 
>>> sexual jokes about "Julia" and chronically wasting people's time, energy 
>>> and patience.
>>>
>>
>> Please refrain from constant ad hominem attacks, here and on GitHub.  
>> They definitely do not fit into the "*Be respectful and inclusive" *part 
>> of the JCS.  Threatening banning, deleting posts, defending other people 
>> who make ad hominem attacks, as well as using sexual language in ad hominem 
>> attacks (and never once apologizing) are definitely things that don't fit 
>> the JCS at all.
>>
>>
>

Re: [julia-users] Re: Deploying Julia libraries

2015-09-30 Thread Sebastian Good
Setting JULIA_PKGPATH lets me put everything in a more amenable spot, but 
transporting everything over to another identical machine results in 
inconsistent behavior with precompiled binaries. They are always “stale”. Is 
there a good resource to consult to understand binary staleness?

Thanks
On September 30, 2015 at 3:09:21 PM, Steven G. Johnson (stevenj@gmail.com) 
wrote:

Just install the packages into some directory, and add that directory to the 
LOAD_PATH.   You can also precompile them and put the .ji files in some 
directory that you add to the 
LOAD_CACHE_PATH path.   That way your users will get a fixed set of packages at 
a known version, won't need Internet access, and won't need to precompile.
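For example, a .juliarc.jl on the target machine might read as follows (paths are illustrative; LOAD_CACHE_PATH lives in Base on 0.4):

    push!(LOAD_PATH, "/opt/myapp/packages")          # shipped package sources
    unshift!(Base.LOAD_CACHE_PATH, "/opt/myapp/ji")  # shipped precompiled .ji files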

[julia-users] Deploying Julia libraries

2015-09-30 Thread Sebastian Good
With the release of v0.4 imminent, I thought I'd see what the latest 
thinking was on deploying Julia libraries (or even executables) to foreign 
machines. I see several steps required to run a Julia program:

1. Install Julia. This is easy enough to do in a system image.
2. Add/clone packages. This can be scripted, but is at least time consuming 
and more importantly dangerous: a production machine reaching out to the 
internet is frowned upon.
3. Precompile. This can be done on demand but also takes time and 
introduces another step that could theoretically fail. 

Is it possible to pre-compile sets of packages into a portable binary, or 
at least one that assumes only a base installation of Julia?

Thanks


[julia-users] Job posting: science startup

2015-08-27 Thread Sebastian Good
I'm hiring Julia partisans to join me in a stealth-mode science-focused 
startup doing computational math and visualization -- both interactive and 
simulation based, including code running on large clusters. Why Julia? 
We're in a math-heavy domain, speed matters, and we want well-rounded 
hackers to solve the many problems our problem space brings together: 
compute on demand, turning math into code, sophisticated user collaboration 
in a browser, and high end visualization. We're using a combination of 
JavaScript and Julia to get the job done and have loved the great 
community, fast prototyping and high performance we get out of 
Julia.  Areas such as compiling to JavaScript, doing GPU computation, cloud 
APIs, and high-speed networking are likely to be where we can give back to 
that community. We're a distributed team with offices in NYC, Houston & New 
Orleans. Shoot me an email if you're interested in using Julia in a day job 
-- or know someone who should be.


[julia-users] reinterpret SubArray

2015-08-09 Thread Sebastian Good
I found myself trying to reinterpret a subarray today, and it didn't work. 
There are obviously plenty of cases where this makes no sense, but it would 
be convenient where it did. Is this on someone's roadmap? For instance:

reinterpret(Int16, sub(readbytes(...), 5:10))
  
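In the meantime, a workaround sketch that pays for a copy (io stands for whatever stream is being read):

    bytes = readbytes(io)            # Vector{UInt8}
    reinterpret(Int16, bytes[5:10])  # bytes[5:10] copies; 6 bytes -> 3 Int16s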


[julia-users] libgfortran in a linux install

2015-06-25 Thread Sebastian Good
Dumb packaging question:

On Linux (Ubuntu 14), running make binary-dist produces a nice relocatable 
tarball with relative rpath linking. However it doesn't work unless 
libgfortran has been installed on the system. I found this odd because 
libgfortran appears in the build directory (e.g. 
julia-cb77503114/lib/julia/libgfortran.so.3). Why is it not included in the 
tarball? Is there a make flag that can force this? Or is it avoided for a 
licensing reason?

Thanks,

Sebastian


Re: [julia-users] libgfortran in a linux install

2015-06-25 Thread Sebastian Good
Nope, not there.

Also in the tarball there is a sys.ji but not a sys.so. In this case sys.so is 
in the correct location, but didn’t make it into the tarball.
On June 25, 2015 at 5:40:10 PM, Elliot Saba (staticfl...@gmail.com) wrote:

So the file /home/vagrant/julia/julia-cb77503114/lib/julia/libgfortran.so.3 
does not exist?
-E

On Thu, Jun 25, 2015 at 3:18 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
No apparent problems

./contrib/fixup-libgfortran.sh /home/vagrant/julia/julia-cb77503114/lib/julia
‘/lib/x86_64-linux-gnu/libgcc_s.so.1’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libgcc_s.so.1’
‘/usr/lib/x86_64-linux-gnu/libgfortran.so.3’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libgfortran.so.3’
‘/usr/lib/x86_64-linux-gnu/libquadmath.so.0’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libquadmath.so.0’
Found traces of libgfortran/libgcc in /lib/x86_64-linux-gnu 
/usr/lib/x86_64-linux-gnu 

But libgfortran doesn’t appear in the tar file anywhere. make install doesn’t 
put it in a lib folder anywhere either.
On June 25, 2015 at 4:04:23 PM, Elliot Saba (staticfl...@gmail.com) wrote:

Hmmm, well, it should, so that's worrying that it's not doing so on your 
system.  When you run `make binary-dist`, the makefile should run a script 
called `fixup-libgfortran.sh`, which does exactly what you want.  Here is an 
example log of what it should look like when you run `make binary-dist`, the 
important part of which is:

./contrib/fixup-libgfortran.sh 
/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia
`/home/centos/local/lib64/libgcc_s.so.1' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libgcc_s.so.1'
`/home/centos/local/lib64/libgfortran.so.3' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libgfortran.so.3'
`/home/centos/local/lib64/libquadmath.so.0' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libquadmath.so.0'
Found traces of libgfortran/libgcc in /home/centos/local/lib64
./contrib/fixup-libstdc++.sh 
/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia
`/home/centos/local/lib64/libstdc++.so.6' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libstdc++.so.6'

This is showing how it finds and installs libgfortran.so.3, you should see 
something similar in your installs as well.  This should run as long as you're 
not running on windows, so there must be some kind of error happening that is 
hopefully printed out to the console.
-E



Re: [julia-users] libgfortran in a linux install

2015-06-25 Thread Sebastian Good
No apparent problems

./contrib/fixup-libgfortran.sh /home/vagrant/julia/julia-cb77503114/lib/julia
‘/lib/x86_64-linux-gnu/libgcc_s.so.1’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libgcc_s.so.1’
‘/usr/lib/x86_64-linux-gnu/libgfortran.so.3’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libgfortran.so.3’
‘/usr/lib/x86_64-linux-gnu/libquadmath.so.0’ -> 
‘/home/vagrant/julia/julia-cb77503114/lib/julia/libquadmath.so.0’
Found traces of libgfortran/libgcc in /lib/x86_64-linux-gnu 
/usr/lib/x86_64-linux-gnu 

But libgfortran doesn’t appear in the tar file anywhere. make install doesn’t 
put it in a lib folder anywhere either.
On June 25, 2015 at 4:04:23 PM, Elliot Saba (staticfl...@gmail.com) wrote:

Hmmm, well, it should, so that's worrying that it's not doing so on your 
system.  When you run `make binary-dist`, the makefile should run a script 
called `fixup-libgfortran.sh`, which does exactly what you want.  Here is an 
example log of what it should look like when you run `make binary-dist`, the 
important part of which is:

./contrib/fixup-libgfortran.sh 
/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia
`/home/centos/local/lib64/libgcc_s.so.1' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libgcc_s.so.1'
`/home/centos/local/lib64/libgfortran.so.3' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libgfortran.so.3'
`/home/centos/local/lib64/libquadmath.so.0' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libquadmath.so.0'
Found traces of libgfortran/libgcc in /home/centos/local/lib64
./contrib/fixup-libstdc++.sh 
/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia
`/home/centos/local/lib64/libstdc++.so.6' -> 
`/home/centos/buildbot/slave/package_tarball64/build/julia-1e081b79ed/lib/julia/libstdc++.so.6'

This is showing how it finds and installs libgfortran.so.3, you should see 
something similar in your installs as well.  This should run as long as you're 
not running on windows, so there must be some kind of error happening that is 
hopefully printed out to the console.
-E

Re: [julia-users] Current Performance w Trunk Compared to 0.3

2015-06-11 Thread Sebastian Good
I've seen the same. Looked away for a few weeks, and my code got ~5x 
slower. There's a lot going on so it's hard to say without detailed 
testing. However this code was always very sensitive to optimization to be 
able to specialize code which read data of different types. I got massive 
increases in memory allocations. I'll try to narrow it down, but it seems 
like perhaps something was done with optimization passes or type inference?

On Wednesday, June 10, 2015 at 9:31:59 AM UTC-4, Kevin Squire wrote:

 Short answer: no, poor performance across the board is not a known issue.

 Just curious, do you see these timing issues locally as well?  In other 
 words, is it a problem with Julia, or a problem with Travis (the continuous 
 integration framework)?

 It might be the case that some changes in v0.4 have (possibly 
 inadvertently) slowed down certain workflows compared with v0.3, whereas 
 others are unchanged or even faster.  

 Could you run profiling and see what parts of the code are the slowest, 
 and then file issues for any slowdowns, with (preferably minimal) examples?

 Cheers,
Kevin

 On Wed, Jun 10, 2015 at 9:10 AM, andrew cooke and...@acooke.org 
 javascript: wrote:


 Is it the current poor performance / allocation a known issue?

 I don't know how long this has been going on, and searching for 
 performance in issues gives a lot of hits, but I've been maintaining some 
 old projects and noticed that timed tests are running significantly slower 
 with trunk than 0.3.  CRC.jl was 40x slower - I ended up cancelling the 
 Travis build, and assumed it was a weird glitch that would be fixed.  But 
 now I am seeing slowdowns with IntModN.jl too (factor more like 4x as slow).

 You can see this at https://travis-ci.org/andrewcooke/IntModN.jl 
 (compare the timing results in the two jobs) and at 
 https://travis-ci.org/andrewcooke/CRC.jl/builds/66140801 (i have been 
 cancelling jobs there, so the examples aren't as complete).

 Andrew




[julia-users] Re: Announcement: Escher.jl - a toolkit for beautiful Web UIs in pure Julia

2015-06-09 Thread Sebastian Good
Looks pretty interesting and in line with a lot of current web framework 
thinking. Are there any pieces inside Escher which help with compiling 
Julia code to JavaScript? Or is it all driven from the server?

On Monday, June 8, 2015 at 12:23:21 PM UTC-4, Shashi Gowda wrote:

 Hello all!

 I have been working on a package called *Escher* over the past several 
 months.

 It is now quite feature-rich and ready to use in some sense. I put 
 together an overview here:

https://shashi.github.io/Escher.jl/*

 My aim is to converge at a UI toolkit that any Julia programmer can use to 
 create rich interactive GUIs and deploy them over the web, *within 
 minutes*.

 Escher simplifies the web platform into a simple and pleasant pure-Julia 
 library. You don't need to learn or write HTML or CSS or JavaScript. Many 
 problems associated with traditional web development basically disappear. 
 There is no need to write separate front-end and back-end code, layouts are 
 tractable and similar to layouts in the original TeX. Communication is done 
 under-the-hood as and when required. No boilerplate code. Things just look 
 great by default.

 Briefly, here is how Escher works under the hood:

 - A UI is an immutable Julia value that is converted to a Virtual DOM 
 (http://en.wikipedia.org/wiki/React_%28JavaScript_library%29#Virtual_DOM) 
 using the Patchwork (https://github.com/shashi/Patchwork.jl) library. 
 Compose graphics and Gadfly plots get rendered to Virtual DOM as well. 
 - Subsequent updates to a UI are sent as patches to the current UI over a 
 websocket connection 
 - Input widgets send messages to the server over the same websocket 
 connection 
 - Complex things like tabs, slideshows, code editor, TeX and dropdown 
 menus are set up as custom HTML elements using the Web Component 
 (http://webcomponents.org/) specification, mainly using the Polymer 
 (http://polymer-project.org/) library. These things are just Virtual DOM 
 nodes in the end.


 This is still a work in progress, I am very happy to receive any critique 
 on this thread, and bug reports on Github 
 https://github.com/shashi/Escher.jl. I am very excited to see what you 
 will create with Escher.

 Thanks! :)
 Shashi

 * - Escher uses some bleeding-edge Web features, this page might not work 
 so well on Safari, should work well on a decently new Chrome, and on 
 Firefox if you wait for long enough for all the assets to load. I will be 
 fixing this in due time, and also working on a cross-browser testing 
 framework.

 PS: I have been dealing with RSI issues of late and my hands will 
 appreciate any help with expanding the documentation! See 
 https://github.com/shashi/Escher.jl/issues/26 if you wish to help.



Re: [julia-users] type stable leading_zeros etc.

2015-05-01 Thread Sebastian Good
https://github.com/JuliaLang/julia/pull/11087

On April 30, 2015 at 5:56:02 PM, Stefan Karpinski (ste...@karpinski.org) wrote:
Yeah, this seems like a reasonable change. If you want to make a PR, this 
shouldn't be too hard. Change the relevant definitions, run `make testall` and 
see what breaks, fix it, repeat. It will potentially cause some breakage in 
packages, but this is a good time for that and it shouldn't be too bad.

On Thu, Apr 30, 2015 at 2:39 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
And I guess as a matter of practicality, a vectorized leading_zeros instruction 
should leave its results in the same sized registers as it started, or it would 
only be possible on Int64s, though I don’t know if LLVM is doing that just yet.
On April 30, 2015 at 2:36:53 PM, Sebastian Good 
(sebast...@palladiumconsulting.com) wrote:

Existing compiler intrinsics work this way (__lzcnt, __lzcnt64, __lzcnt16), It 
came up for me in the following line of code in StreamingStats

    ρ(s::Uint32) = uint32(uint32(leading_zeros(s)) + 0x0001)

The outer uint32 is no longer necessary in v0.4 because the addition no longer 
expands 32-bit operands to a 64-bit result. The inner one is still necessary 
because leading_zeros does. I imagine there are many little functions like this 
that should probably act the same way.

I ran into in my own code for converting IBM/370 floating points to IEEE

    local norml::UInt32 = leading_zeros(fr)
    fr <<= norml
    ex = (ex << 2) - 130 - norml

Where I had to convert  norml to a UInt32 to preserve type stability in the bit 
shifting operation below, where I’m working with 32 bit numbers. Leaving this 
convert out causes the usual massive slowdown in speed when converting tens of 
millions of numbers.

Arguments I can make for making them have the same type — recognizing this is 
quite subjective!

- If you’re doing something with leading_zeros, you’re aware you’re working 
directly in an integer register in binary code; you’re trying to do something 
clever and you’ll want type stability
- No binary number could have more leading zeros than it itself could represent 
- The existing intrinsics are written this way
- Because I ran into it twice and wished it were that way both times! :-D
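Concretely, the change amounts to something like this sketch (hypothetical helper name; the actual PR may implement it differently, e.g. at the intrinsic level):

    # Truncating back to T is always safe: a value can't have more leading
    # zeros than its bit width, and T can represent its own bit width.
    lz{T<:Integer}(x::T) = convert(T, leading_zeros(x))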



On April 30, 2015 at 2:16:26 PM, Stefan Karpinski (ste...@karpinski.org) wrote:

I'm not sure why the result of leading_zeros should be of the same type as the 
argument. What's the use case?
 



Re: [julia-users] performance of functions

2015-04-30 Thread Sebastian Good
@anon is a nice piece of functionality but translating it to work 
post-tupocalypse turns out to be more than I can currently grok! Tuples of 
types aren’t types anymore so the mechanics of the @generated functions require 
some changing. Wish I could help; any hints?
On April 30, 2015 at 5:30:57 AM, Tim Holy (tim.h...@gmail.com) wrote:

Check the SO post again; there are now many suggested workarounds, some of  
which are not a big hit to readability.  

And no, this won't be fixed in 0.4.  

--Tim  

On Wednesday, April 29, 2015 08:57:46 PM Sebastian Good wrote:  
 I ran into this issue today 
 (http://stackoverflow.com/questions/26173635/performance-penalty-using-anonymous-function-in-julia) 
 whereby functions -- whether anonymous or not -- 
 generate lots of garbage when called indirectly. That is when using type  
 signatures like  
 clever_function(f::Function).  
  
 Is making this code more efficient in scope for the 0.4 release? The  
 workaround -- hand-coding types then using type dispatch -- is quite  
 effective, but definitely a big hit to readability.  
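The workaround in question looks roughly like this (a sketch in the style of the Functors.jl approach; the names are mine):

    # Each "function" becomes a singleton type with its own evaluate
    # method, so the loop specializes on F and the call can inline.
    immutable Square end
    evaluate(::Square, x) = x * x

    function clever_function{F}(f::F, xs)
        s = 0.0
        for x in xs
            s += evaluate(f, x)
        end
        s
    end

    clever_function(Square(), rand(1000))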



Re: [julia-users] type stable leading_zeros etc.

2015-04-30 Thread Sebastian Good
And I guess as a matter of practicality, a vectorized leading_zeros instruction 
should leave its results in the same sized registers as it started, or it would 
only be possible on Int64s, though I don’t know if LLVM is doing that just yet.
On April 30, 2015 at 2:36:53 PM, Sebastian Good 
(sebast...@palladiumconsulting.com) wrote:

Existing compiler intrinsics work this way (__lzcnt, __lzcnt64, __lzcnt16), It 
came up for me in the following line of code in StreamingStats

    ρ(s::Uint32) = uint32(uint32(leading_zeros(s)) + 0x0001)

The outer uint32 is no longer necessary in v0.4 because the addition no longer 
expands 32-bit operands to a 64-bit result. The inner one is still necessary 
because leading_zeros does. I imagine there are many little functions like this 
that should probably act the same way.

I ran into in my own code for converting IBM/370 floating points to IEEE

    local norml::UInt32 = leading_zeros(fr)
    fr <<= norml
    ex = (ex << 2) - 130 - norml

Where I had to convert  norml to a UInt32 to preserve type stability in the bit 
shifting operation below, where I’m working with 32 bit numbers. Leaving this 
convert out causes the usual massive slowdown in speed when converting tens of 
millions of numbers.

Arguments I can make for making them have the same type — recognizing this is 
quite subjective!

- If you’re doing something with leading_zeros, you’re aware you’re working 
directly in an integer register in binary code; you’re trying to do something 
clever and you’ll want type stability
- No binary number could have more leading zeros than it itself could represent 
- The existing intrinsics are written this way
- Because I ran into it twice and wished it were that way both times! :-D



On April 30, 2015 at 2:16:26 PM, Stefan Karpinski (ste...@karpinski.org) wrote:

I'm not sure why the result of leading_zeros should be of the same type as the 
argument. What's the use case?
 

Re: [julia-users] type stable leading_zeros etc.

2015-04-30 Thread Sebastian Good
Existing compiler intrinsics work this way (__lzcnt, __lzcnt64, __lzcnt16), It 
came up for me in the following line of code in StreamingStats

    ρ(s::Uint32) = uint32(uint32(leading_zeros(s)) + 0x0001)

The outer uint32 is no longer necessary in v0.4 because the addition no longer 
expands 32-bit operands to a 64-bit result. The inner one is still necessary 
because leading_zeros does. I imagine there are many little functions like this 
that should probably act the same way.

I ran into in my own code for converting IBM/370 floating points to IEEE

    local norml::UInt32 = leading_zeros(fr)
    fr <<= norml
    ex = (ex << 2) - 130 - norml

Where I had to convert  norml to a UInt32 to preserve type stability in the bit 
shifting operation below, where I’m working with 32 bit numbers. Leaving this 
convert out causes the usual massive slowdown in speed when converting tens of 
millions of numbers.

Arguments I can make for making them have the same type — recognizing this is 
quite subjective!

- If you’re doing something with leading_zeros, you’re aware you’re working 
directly in an integer register in binary code; you’re trying to do something 
clever and you’ll want type stability
- No binary number could have more leading zeros than it itself could represent 
- The existing intrinsics are written this way
- Because I ran into it twice and wished it were that way both times! :-D



On April 30, 2015 at 2:16:26 PM, Stefan Karpinski (ste...@karpinski.org) wrote:

I'm not sure why the result of leading_zeros should be of the same type as the 
argument. What's the use case?
 

[julia-users] type stable leading_zeros etc.

2015-04-30 Thread Sebastian Good
Recent 0.4 changes made expressions like Int32(1) + Int32(2) type stable, 
i.e. returning Int32 instead of Int64 as they did previously (on a 64-bit 
system, anyway). However leading_zeros seems to always return an Int64. I 
wonder if it makes sense to convert the result of leading_zeros to the same 
type as its argument, at least for the base Integer bittypes?

FWIW, this came up while updating StreamStats.jl for v0.4, where a few 
functions had to take care to convert intermediate results back to Int32.


[julia-users] idiom for range + offset

2015-04-29 Thread Sebastian Good
I find myself creating ranges of the form i:i+num-1. That is, the ith 
thing, and num more. Whenever I see a +1 or -1 I worry about careless off 
by one errors, so I wonder if there is a function or operator to capture 
this idiom, or open and closed ranges in general?


Re: [julia-users] idiom for range + offset

2015-04-29 Thread Sebastian Good
Perfect, thanks!
On April 29, 2015 at 10:27:04 AM, René Donner (li...@donner.at) wrote:

I believe range(i,num) is what you are looking for.  


Am 29.04.2015 um 16:22 schrieb Sebastian Good 
sebast...@palladiumconsulting.com:  

 I find myself creating ranges of the form i:i+num-1. That is, the ith thing, 
 and num more. Whenever I see a +1 or -1 I worry about careless off by one 
 errors, so I wonder if there is a function or operator to capture this idiom, 
 or open and closed ranges in general?  
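A quick check of the two spellings (two-argument range takes a start and a length on 0.3/0.4):

    julia> i, num = 5, 3;

    julia> i:i+num-1
    5:7

    julia> range(i, num)
    5:7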



[julia-users] performance of functions

2015-04-29 Thread Sebastian Good
I ran into this issue today 
(http://stackoverflow.com/questions/26173635/performance-penalty-using-anonymous-function-in-julia)
 
whereby functions -- whether anonymous or not -- generate lots of garbage 
when called indirectly. That is when using type signatures like 
clever_function(f::Function).

Is making this code more efficient in scope for the 0.4 release? The 
workaround -- hand-coding types then using type dispatch -- is quite 
effective, but definitely a big hit to readability.


Re: [julia-users] Auto warn for unstable types

2015-04-25 Thread Sebastian Good
Iain, I love this idea! Even languages like JavaScript  and Visual Basic 
which embraced dynamic typing added directives to help users avoid the most 
obvious problems. With the wonderful richness of Julia's type system, it 
would be nice to be able to 'opt in' to a fully enforced function or 
module. I'm guessing this will be much more useful when we can type 
functions?

On Friday, April 24, 2015 at 12:49:22 PM UTC-4, Iain Dunning wrote:

 I think something I'd use was a @strict macro that annotates a function as 
 something I'm willing to get plenty of warnings from, that'd be quite nice. 
 We have the ability to add such compiler flags now, right?

 On Friday, April 24, 2015 at 12:21:06 PM UTC-4, Peter Brady wrote:

 Other startup flags have a user/all option which would conceivably solve 
 the problem of getting too many warnings on startup.  My views are likely 
 colored by my use case - solving systems of PDEs.  Since my work is 
 strictly numerical, I've never met an ::Any that served a useful purpose.   

 On Friday, April 24, 2015 at 9:50:17 AM UTC-6, Tim Holy wrote:

 Related ongoing discussion in 
 https://github.com/JuliaLang/julia/issues/10980 

 But I don't think it's practical or desirable to warn about all type 
 instability; there are plenty of cases where it's either a useful or 
 unavoidable property. The goal of optimization should be to eliminate 
 those 
 cases that actually matter for performance, and not worry about the ones 
 that 
 don't. If you run your code (or just, start julia) and see 100 warnings 
 scroll 
 past, you won't know where to begin. 

 --Tim 

 On Friday, April 24, 2015 11:12:57 AM Stefan Karpinski wrote: 
  Yes, I'd like to add exactly this kind of thing. 
  
  On Fri, Apr 24, 2015 at 10:54 AM, Peter Brady peter...@gmail.com 
 wrote: 
   Tim Holy introduced me to the wonders of @code_warntype in this 
 discussion 
   https://groups.google.com/forum/#!topic/julia-users/sq5gj-3TdQU. 
  I've 
   since been using it to track down other instabilities in my code 
 since it 
   turns out that I'm very good at writing poor julia code.  Are there 
 any 
   plans to incorporate automatic warnings about type unstable 
 functions when 
   they are compiled?  Maybe even via a startup flag like `-Wunstable`? 
  I 
   would prefer that its on by default.  This would go a long way 
 towards 
   helping me write much better code and probably help new users get 
 more of 
   the performance they were expecting. 
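For anyone following along, a toy case of the check such a flag would automate:

    # x starts as an Int and becomes a Float64 inside the loop, so its
    # type is unstable; @code_warntype highlights it at the REPL.
    function unstable(n)
        x = 1
        for i = 1:n
            x /= 2
        end
        x
    end

    @code_warntype unstable(10)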



[julia-users] Re: Is there a plotting package that works for a current 0.4 build?

2015-04-25 Thread Sebastian Good
Gadfly is very close or nearly there, though it relies on changes to some 
downstream packages and I don't know if those have all been updated such 
that you'll get the edge versions if you grab Gadfly.

https://github.com/dcjones/Gadfly.jl/pull/587
https://github.com/dcjones/Gadfly.jl/commit/4daa1759dbf90a10936e2bd21e0b3d0da70b8923


On Saturday, April 25, 2015 at 9:10:43 AM UTC-4, Steven G. Johnson wrote:

 PyPlot was updated for the tupocolypse; I don't know if it has broken 
 again in the last day or two, though, but it should be easy to fix up again.



Re: [julia-users] Re: Non-GPL Julia?

2015-04-20 Thread Sebastian Good
+1

On April 20, 2015 at 9:18:10 AM, Jay Kickliter (jay.kickli...@gmail.com) wrote:
Thanks Viral

On Sunday, April 19, 2015 at 10:05:23 AM UTC-6, Viral Shah wrote:
And it is merged now. 

On Saturday, April 18, 2015 at 4:22:26 PM UTC+5:30, Scott Jones wrote:
That's great! That solves our dilemma for us!

Scott

Re: [julia-users] zero cost subarray?

2015-04-19 Thread Sebastian Good
--track-allocation still requires guesswork, as optimizations can move the 
allocation to a different place than you would expect.
On April 19, 2015 at 4:36:19 PM, Peter Brady (petertbr...@gmail.com) wrote:

So I discovered the --track-allocation option and now I am really confused:

Here's my session:

$ julia --track-allocation=all
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.8-pre+13 (2015-04-17 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0df962d* (2 days old release-0.3)
|__/                   |  x86_64-redhat-linux

julia> include("test.jl")
test_all (generic function with 1 method)

julia> test_unsafe(5)

And here's the relevant part of the resulting test.jl.mem file.  Note that I 
commented out some calls to 'size' and replaced with the appropriate hard-coded 
values but the resulting allocation is the same... Can anyone shed some light 
on this while I wait for 0.4 to compile?

        - function update(a::AbstractArray, idx, off)
  8151120     for i=1:320 #size(a, idx)
        0         a[i] = -10*off+i
        -     end
        0     a
        - end
        - 
       - function setk_UnSafe{T}(a::Array{T,3})
      760     us = UnsafeSlice(a, 3)
        0     for j=1:size(a,2),i=1:size(a,1)
  8151120         us.start = (j-1)*320+i #size(a,1)+i
        -         #off = sub2ind(size(a), i, j, 1)
        0         update(us, 3, us.start)
        -     end
        0     a
        - end
        - function test_unsafe(n)
        0     a = zeros(Int, (320, 320, 320))
        -     # warmup
        0     setk_UnSafe(a);
        0     clear_malloc_data()
        -     #@time (
        0     for i=1:n; setk_UnSafe(a); end
        - end


On Sunday, April 19, 2015 at 2:21:56 PM UTC-6, Peter Brady wrote:
@Dahua, thanks for adding an unsafeview!  I appreciate how quickly this 
community responds.

I've added the following function to my test.jl script
function setk_unsafeview{T}(a::Array{T,3})
    for j=1:size(a,2),i=1:size(a,1)
        off = sub2ind(size(a), i, j, 1)
        update(unsafe_view(a, i, j, :), 3, off)
    end
    a
end
 But I'm not seeing the large increase in performance I was expecting.  My 
timings are now

julia> test_all(5);
test_stride
elapsed time: 2.156173128 seconds (0 bytes allocated)
test_view
elapsed time: 9.30964534 seconds (94208000 bytes allocated, 0.47% gc time)
test_unsafe
elapsed time: 2.169307471 seconds (16303000 bytes allocated)
test_unsafeview
elapsed time: 8.955876793 seconds (90112000 bytes allocated, 0.41% gc time)

To be fair, I am cheating a bit with my custom 'UnsafeSlice' since I make only 
one instance and simply update the offset on each iteration.  If I make it 
immutable and create a new instance on every iteration (as I do for the view 
and unsafeview), things slow down a little and the allocation goes south:

julia> test_all(5);
test_stride
elapsed time: 2.159909265 seconds (0 bytes allocated)
test_view
elapsed time: 9.029025282 seconds (94208000 bytes allocated, 0.43% gc time)
test_unsafe
elapsed time: 2.621667854 seconds (114606240 bytes allocated, 2.41% gc time)
test_unsafeview
elapsed time: 8.888434466 seconds (90112000 bytes allocated, 0.44% gc time)

These are all with 0.3.8-pre.  I'll try compiling master and see what happens.  
I'm still confused about why allocating a single type with a pointer, 2 ints 
and a tuple costs so much memory though.



On Sunday, April 19, 2015 at 11:38:17 AM UTC-6, Tim Holy wrote:
It's not just escape analysis, as this (new) issue demonstrates:
https://github.com/JuliaLang/julia/issues/10899

--Tim

On Sunday, April 19, 2015 12:33:51 PM Sebastian Good wrote:
 Their size seems much decreased. I’d imagine to totally avoid allocation in
 this benchmark requires an optimization that really has nothing to do with
 subarrays per se. You’d have to do an escape analysis and see that Aj never
 left sumcols. Not easy in practice, since it’s passed to slice and length,
 and you’d have to make sure they didn’t squirrel it away or pass it on to
 someone else. Then you could stack allocate it, or even destructure it into
 a bunch of scalar mutations on the stack. After eliminating dead code,
 you’d end up with a no-allocation loop much like you’d write by hand. This
 sort of optimization seems to be quite tricky for compilers to pull off,
 but it’s a common pattern in numerical code.

 In Julia is such cleverness left entirely to LLVM, or are there optimization
 passes in Julia itself? On April 19, 2015 at 6:49:21 AM, Tim Holy
 (tim@gmail.com) wrote:

 Sorry to be slow to chime in here, but the tuple overhaul has landed and
 they are still not zero-cost:

function sumcols(A)
    s = 0.0
    for j = 1:size(A,2)
        Aj = slice(A, :, j)
        for i = 1:length(Aj)
            s += Aj[i]
        end
    end
    s
end

Even in the latest 0.4, this still allocates memory. On the other hand, while 
SubArrays allocate nearly 2x more memory than ArrayViews, the speed of the two 
(replacing `slice` with `view` above) is, for me, nearly identical.

Re: [julia-users] zero cost subarray?

2015-04-19 Thread Sebastian Good
Optimizing the creation of many small structures during execution typically 
comes down to either cleverly eliminating the need to allocate them in the 
first place (via escape analysis, and the like) or making the first generation 
of the garbage collector wickedly fast. I understand both of these are being 
worked.
On April 19, 2015 at 8:08:53 PM, Dahua Lin (linda...@gmail.com) wrote:

My benchmark shows that element indexing has been as fast as it can be for 
array views (or subarrays in Julia 0.4). 

Now the problem is actually the construction of views/subarrays. To optimize 
the overhead of this part, the compiler may need to introduce additional 
optimization.

Dahua 


On Monday, April 20, 2015 at 6:39:35 AM UTC+8, Sebastian Good wrote:
--track-allocation still requires guesswork, as optimizations can move the 
allocation to a different place than you would expect.
On April 19, 2015 at 4:36:19 PM, Peter Brady (peter...@gmail.com) wrote:

So I discovered the --track-allocation option and now I am really confused:

Here's my session:

$ julia --track-allocation=all
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.8-pre+13 (2015-04-17 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0df962d* (2 days old release-0.3)
|__/                   |  x86_64-redhat-linux

julia> include("test.jl")
test_all (generic function with 1 method)

julia> test_unsafe(5)

And here's the relevant part of the resulting test.jl.mem file.  Note that I 
commented out some calls to 'size' and replaced with the appropriate hard-coded 
values but the resulting allocation is the same... Can anyone shed some light 
on this while I wait for 0.4 to compile?

        - function update(a::AbstractArray, idx, off)
  8151120     for i=1:320 #size(a, idx)
        0         a[i] = -10*off+i
        -     end
        0     a
        - end
        - 
       - function setk_UnSafe{T}(a::Array{T,3})
      760     us = UnsafeSlice(a, 3)
        0     for j=1:size(a,2),i=1:size(a,1)
  8151120         us.start = (j-1)*320+i #size(a,1)+i
        -         #off = sub2ind(size(a), i, j, 1)
        0         update(us, 3, us.start)
        -     end
        0     a
        - end
        - function test_unsafe(n)
        0     a = zeros(Int, (320, 320, 320))
        -     # warmup
        0     setk_UnSafe(a);
        0     clear_malloc_data()
        -     #@time (
        0     for i=1:n; setk_UnSafe(a); end
        - end


On Sunday, April 19, 2015 at 2:21:56 PM UTC-6, Peter Brady wrote:
@Dahua, thanks for adding an unsafeview!  I appreciate how quickly this 
community responds.

I've added the following function to my test.jl script
function setk_unsafeview{T}(a::Array{T,3})
    for j=1:size(a,2),i=1:size(a,1)
        off = sub2ind(size(a), i, j, 1)
        update(unsafe_view(a, i, j, :), 3, off)
    end
    a
end
 But I'm not seeing the large increase in performance I was expecting.  My 
timings are now

julia> test_all(5);
test_stride
elapsed time: 2.156173128 seconds (0 bytes allocated)
test_view
elapsed time: 9.30964534 seconds (94208000 bytes allocated, 0.47% gc time)
test_unsafe
elapsed time: 2.169307471 seconds (16303000 bytes allocated)
test_unsafeview
elapsed time: 8.955876793 seconds (90112000 bytes allocated, 0.41% gc time)

To be fair, I am cheating a bit with my custom 'UnsafeSlice' since I make only 
one instance and simply update the offset on each iteration.  If I make it 
immutable and create a new instance on every iteration (as I do for the view 
and unsafeview), things slow down a little and the allocation goes south:

julia> test_all(5);
test_stride
elapsed time: 2.159909265 seconds (0 bytes allocated)
test_view
elapsed time: 9.029025282 seconds (94208000 bytes allocated, 0.43% gc time)
test_unsafe
elapsed time: 2.621667854 seconds (114606240 bytes allocated, 2.41% gc time)
test_unsafeview
elapsed time: 8.888434466 seconds (90112000 bytes allocated, 0.44% gc time)

These are all with 0.3.8-pre.  I'll try compiling master and see what happens.  
I'm still confused about why allocating a single type with a pointer, 2 ints 
and a tuple costs so much memory though.



On Sunday, April 19, 2015 at 11:38:17 AM UTC-6, Tim Holy wrote:
It's not just escape analysis, as this (new) issue demonstrates:
https://github.com/JuliaLang/julia/issues/10899

--Tim

On Sunday, April 19, 2015 12:33:51 PM Sebastian Good wrote:
 Their size seems much decreased. I’d imagine to totally avoid allocation in
 this benchmark requires an optimization that really has nothing to do with
 subarrays per se. You’d have to do an escape analysis and see that Aj never
 left sumcols. Not easy in practice, since it’s passed to slice and length,
 and you’d have to make sure they didn’t squirrel it away or pass it on to
 someone else. Then you could stack allocate

Re: [julia-users] zero cost subarray?

2015-04-19 Thread Sebastian Good
Their size seems much decreased. I’d imagine to totally avoid allocation in 
this benchmark requires an optimization that really has nothing to do with 
subarrays per se. You’d have to do an escape analysis and see that Aj never 
left sumcols. Not easy in practice, since it’s passed to slice and length, and 
you’d have to make sure they didn’t squirrel it away or pass it on to someone 
else. Then you could stack allocate it, or even destructure it into a bunch of 
scalar mutations on the stack. After eliminating dead code, you’d end up with a 
no-allocation loop much like you’d write by hand. This sort of optimization 
seems to be quite tricky for compilers to pull off, but it’s a common pattern 
in numerical code. 

In Julia is such cleverness left entirely to LLVM, or are there optimization 
passes in Julia itself?
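For comparison, the no-allocation loop such an optimization would effectively produce for the sumcols example quoted below, written by hand with linear indexing:

    function sumcols_manual(A::Matrix{Float64})
        s = 0.0
        m = size(A, 1)
        for j = 1:size(A, 2)
            off = (j - 1) * m
            for i = 1:m
                s += A[off + i]   # same element slice(A, :, j)[i] would give
            end
        end
        s
    end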
On April 19, 2015 at 6:49:21 AM, Tim Holy (tim.h...@gmail.com) wrote:

Sorry to be slow to chime in here, but the tuple overhaul has landed and they  
are still not zero-cost:  

function sumcols(A)
    s = 0.0
    for j = 1:size(A,2)
        Aj = slice(A, :, j)
        for i = 1:length(Aj)
            s += Aj[i]
        end
    end
    s
end

Even in the latest 0.4, this still allocates memory. On the other hand, while  
SubArrays allocate nearly 2x more memory than ArrayViews, the speed of the two  
(replacing `slice` with `view` above) is, for me, nearly identical.  

--Tim  


On Friday, April 17, 2015 08:30:27 PM Sebastian Good wrote:  
 This was discussed a few weeks ago  
  
 https://groups.google.com/d/msg/julia-users/IxrvV8ABZoQ/uWZu5-IB3McJ  
  
 I think the bottom line is that the current implementation *should* be  
 'zero-cost' once a set of planned improvements and optimizations take  
 place. One of the key ones is a tuple overhaul.  
  
 Fair to say it can never be 'zero' cost since there is different inherent  
 overhead depending on the type of subarray, e.g. offset, slice,  
 re-dimension, etc. however the implementation is quite clever about  
 allowing specialization of those.  
  
 In a common case (e.g. a constant offset or simple stride) my understanding  
 is that the structure will be type-specialized and likely stack allocated  
 in many cases, reducing to what you'd write by hand. At least this is what  
 they're after.  
  
 On Friday, April 17, 2015 at 4:24:14 PM UTC-4, Peter Brady wrote:  
  Thanks for the links. I'll check out ArrayViews as it looks like what I  
  was going to do manually without wrapping it in a type.  
   
  By semi-dim agnostic I meant that the differencing algorithm itself only  
  cares about one dimension but that dimension is different for different  
  directions. Only a few toplevel routines actually need to know about the  
  dimensionality of the problem.  
   
  On Friday, April 17, 2015 at 2:04:39 PM UTC-6, René Donner wrote:  
  As far as I have measured it sub in 0.4 is still not cheap, as it  
  provides the flexibility to deal with all kinds of strides and offsets,  
  and  
  the view object itself thus has a certain size. See  
  https://github.com/rened/FunctionalData.jl#efficiency for a simple  
  analysis, where the speed is mostly dominated by the speed of the  
  sub-view mechanism.  
   
  To get faster views which require strides etc you can try  
  https://github.com/JuliaLang/ArrayViews.jl  
   
  What do you mean by semi-dim agnostic? In case you only need indexing  
  along the last dimension (like a[:,:,i] and a[:,:,:,i]) you can use  
   
  https://github.com/rened/FunctionalData.jl#efficient-views-details  
   
  which uses normal DenseArrays and simple pointer updates internally. It  
  can also update a view in-place, by just incrementing the pointer.  
   
  Am 17.04.2015 um 21:48 schrieb Peter Brady peter...@gmail.com:  
   In order to write some differencing algorithms in a semi-dimensional 
   
  agnostic manner the code I've written makes heavy use of subarrays which  
  turn out to be rather costly. I've noticed some posts on the cost of  
  subarrays here and that things will be better in 0.4. Can someone  
  comment  
  on how much better? Would subarray (or anything like it) be on par with  
  simply passing an offset and stride (constant) and computing the index  
  myself? I'm currently using the 0.3 release branch.  



Re: [julia-users] Re: Non-GPL Julia?

2015-04-17 Thread Sebastian Good
Certainly the licensing is important from a commercial aspect, but I think this 
is also an interesting discussion from a  core vs packages perspective. Python 
is separate from numpy, but indeed no one is under the illusion that they 
should work against any other sort of array package. Core linear algebra and 
array cleverness seems comfortable in Julia’s Base, but with so many different 
kinds of users of Julia — even just considering those doing math science with 
it — things like solvers and sparse arrays certainly feel like they could be in 
packages. Packages blessed by julialang, to be sure, but perhaps separate from 
the core.
On April 16, 2015 at 12:32:06 PM, Viral Shah (vi...@mayin.org) wrote:

The useful parts of SuiteSparse are all GPL. So, for a GPL-free build, it is 
straightforward to completely avoid using SuiteSparse.

One of the things I want is to have a version of Julia built with Intel 
compilers and linked to MKL. Julia can already use Intel's BLAS, LAPACK, LIBM, 
and FFT routines. There is also a VML package for vector math functions. The 
only big missing piece is sparse solvers - but perhaps that is ok for people, 
who can use Intel's sparse solvers or MUMPS or something else.

-viral

On Thursday, April 16, 2015 at 7:51:38 PM UTC+5:30, Isaiah wrote:
I recently annotated the license list to give myself (and others) a quick-look 
grasp of the license situation:

https://github.com/JuliaLang/julia/commit/d2ee85d1135fd801f1230530f39f05369f6384df

I agree with Tony that in the short-term, distributing a GPL-free binary 
ourselves is not a priority, but pull requests to make the situation clearer or 
to make a GPL-free build simpler would be fine. For example, there could be a 
NO_GPL Makefile variable, and a macro on the Julia side to annotate and 
selectively exclude GPL stuff from the system image (FFTW and Rmath should be, 
respectively, easy and very easy to exclude, however I'm not sure how deeply 
entangled the SuiteSparse stuff is).



On Thu, Apr 16, 2015 at 10:04 AM, Tony Kelman t...@kelman.net wrote:
It's certainly a long-term goal. 0.4 is far enough behind-schedule already that 
it's very unlikely to happen by then. Like most things in open source, it's 
limited by available labor. People who want to see it happen will need to help 
out if they want it to happen faster. For this particular issue of GPL 
dependencies, the most useful places to contribute would be helping set up 
BinDeps for the forked Rmath-julia library so it does not need to be built by 
base and Distributions.jl can still work well and be easy to install, and 
asking on the New DFT API pull request whether there are specific areas where 
Steven Johnson needs help - likely answers are benchmarking, conflict 
resolution to rebase to master, and setting up FFTW as a package with automatic 
BinDeps etc.

Removing things from Base has proven difficult to do smoothly, and while it 
will be necessary to slim down the mandatory runtime dependencies for 
embedding, static compilation, and less-restrictive licensing use cases, a lot 
of work still needs to be done to figure out how to manage code migrations in 
the least disruptive manner possible. I don't think this is the primary concern 
of any core Julia developers or contributors at the moment (in fact several 
people have said they would strongly prefer to not remove any other code from 
Base until after 0.4.0 is released, and I agree with that), but help and 
contributions are always welcome.


On Wednesday, April 15, 2015 at 6:51:44 AM UTC-7, Sebastian Good wrote:
Is producing a non-GPL Julia build still on the radar? It might be a nice goal 
for the 0.4 release, even if we have to build it ourselves (e.g. against MKL, 
etc.)

On Monday, April 21, 2014 at 5:00:47 PM UTC-4, Steven G. Johnson wrote:


On Monday, April 21, 2014 4:40:38 PM UTC-4, Tobias Knopp wrote:
Yes this is awesome work you have done there. Do you plan to implement the 
real-data FFT, DCT and DST in pure Julia also? Then one could really think 
about moving FFTW into a package. Hopefully its author is ok with that ;-)

I plan to implement real-data FFTs, and move FFTW into a package.

Pure-Julia DCT and DST are not in my immediate plans (they are a PITA to do 
right because there are 16 types, of which 8 are common); my feeling is that 
the need for these is uncommon enough that it's not terrible to have these in a 
package instead of in Base.    (Hadamard transforms and MDCTs are also 
currently in packages.)



Re: [julia-users] Re: Non-GPL Julia?

2015-04-17 Thread Sebastian Good
Scott: to save anyone else the trouble of saying the same thing: the best way 
to achieve this will be to roll up your sleeves and help take care of it 
yourself. :-D

Viral - I’m happy to try and spend some extra cycles getting Julia to compile 
with the Intel tool suite if that helps start to cut the knot. I have the 
licenses and it could be a useful way to learn a bit more about the core. Who 
is the right person to bug as I figure that out?
On April 17, 2015 at 11:33:51 AM, Scott Jones (scott.paul.jo...@gmail.com) 
wrote:

Yes, but I need a solution to keeping things GPL free (LGPL would be fine) in 
the short-term (say, in the next 3-6 months maybe), not long-term.
For anybody contemplating using Julia for commercial projects, this seems like 
a critical issue...

If there is some way of building a Julia executable without any of the GPL 
code, that will still run as long as you don't use FFTW, SUITESPARSE, or RMATH,
then that would be fine.

It does seem that Julia has had the (mathematical) everything including the 
kitchen sink thrown in (although decimal floating point is a surprising 
omission),
and I hate to see that limiting its potential use because of licensing issues...

Scott

On Friday, April 17, 2015 at 11:16:36 AM UTC-4, Isaiah wrote:
Packages blessed by julialang, to be sure, but perhaps separate from the core.

Yes, there is a broad consensus that this will be the long-term direction.
https://github.com/JuliaLang/julia/issues/5155


On Fri, Apr 17, 2015 at 11:08 AM, Sebastian Good 
seba...@palladiumconsulting.com wrote:
Certainly the licensing is important from a commercial aspect, but I think this 
is also an interesting discussion from a  core vs packages perspective. Python 
is separate from numpy, but indeed no one is under the illusion that they 
should work against any other sort of array package. Core linear algebra and 
array cleverness seems comfortable in Julia’s Base, but with so many different 
kinds of users of Julia — even just considering those doing math science with 
it — things like solvers and sparse arrays certainly feel like they could be in 
packages. Packages blessed by julialang, to be sure, but perhaps separate from 
the core.
On April 16, 2015 at 12:32:06 PM, Viral Shah (vi...@mayin.org) wrote:

The useful parts of SuiteSparse are all GPL. So, for a GPL-free build, it is 
straightforward to completely avoid using SuiteSparse.

One of the things I want is to have a version of Julia built with Intel 
compilers and linked to MKL. Julia can already use Intel's BLAS, LAPACK, LIBM, 
and FFT routines. There is also a VML package for vector math functions. The 
only big missing piece is sparse solvers - but perhaps that is ok for people, 
who can use Intel's sparse solvers or MUMPS or something else.

-viral

On Thursday, April 16, 2015 at 7:51:38 PM UTC+5:30, Isaiah wrote:
I recently annotated the license list to give myself (and others) a quick-look 
grasp of the license situation:

https://github.com/JuliaLang/julia/commit/d2ee85d1135fd801f1230530f39f05369f6384df

I agree with Tony that in the short-term, distributing a GPL-free binary 
ourselves is not a priority, but pull requests to make the situation clearer or 
to make a GPL-free build simpler would be fine. For example, there could be a 
NO_GPL Makefile variable, and a macro on the Julia side to annotate and 
selectively exclude GPL stuff from the system image (FFTW and Rmath should be, 
respectively, easy and very easy to exclude, however I'm not sure how deeply 
entangled the SuiteSparse stuff is).
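
As a concrete sketch of that annotation idea (hypothetical: neither a NO_GPL
flag nor this macro exists in Base today; shown only to make the proposal
tangible):

# Hypothetical sketch; `NO_GPL` is an assumed environment/build flag.
const use_gpl_libs = get(ENV, "NO_GPL", "0") == "0"

macro gpl_only(ex)
    use_gpl_libs ? esc(ex) : :(nothing)
end

@gpl_only include("fftw.jl")   # skipped entirely in a GPL-free build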



On Thu, Apr 16, 2015 at 10:04 AM, Tony Kelman to...@kelman.net wrote:
It's certainly a long-term goal. 0.4 is far enough behind schedule already that 
it's very unlikely to happen by then. Like most things in open source, it's 
limited by available labor. People who want to see it happen will need to help 
out if they want it to happen faster. For this particular issue of GPL 
dependencies, the most useful places to contribute would be helping set up 
BinDeps for the forked Rmath-julia library so it does not need to be built by 
base and Distributions.jl can still work well and be easy to install, and 
asking on the New DFT API pull request whether there are specific areas where 
Steven Johnson needs help - likely answers are benchmarking, conflict 
resolution to rebase to master, and setting up FFTW as a package with automatic 
BinDeps etc.

Removing things from Base has proven difficult to do smoothly, and while it 
will be necessary to slim down the mandatory runtime dependencies for 
embedding, static compilation, and less-restrictive licensing use cases, a lot 
of work still needs to be done to figure out how to manage code migrations in 
the least disruptive manner possible. I don't think this is the primary concern 
of any core Julia developers or contributors at the moment (in fact several 
people have said they would strongly prefer to not remove any other code from 
Base until after 0.4.0).

Re: [julia-users] zero cost subarray?

2015-04-17 Thread Sebastian Good
This was discussed a few weeks ago

https://groups.google.com/d/msg/julia-users/IxrvV8ABZoQ/uWZu5-IB3McJ

I think the bottom line is that the current implementation *should* be 
'zero-cost' once a set of planned improvements and optimizations take 
place. One of the key ones is a tuple overhaul.

Fair to say it can never be 'zero' cost since there is different inherent 
overhead depending on the type of subarray, e.g. offset, slice, 
re-dimension, etc. However, the implementation is quite clever about 
allowing specialization for each of those.

In a common case (e.g. a constant offset or simple stride) my understanding 
is that the structure will be type-specialized and likely stack allocated 
in many cases, reducing to what you'd write by hand. At least this is what 
they're after.
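
To make "what you'd write by hand" concrete, a small sketch (hypothetical
functions, 0.4-era sub; timings will vary):

a = rand(1_000_000)
s = sub(a, 1001:2000)               # view with a constant offset

function sum_view(s)                # indexes through the view
    t = 0.0
    for i in 1:length(s)
        t += s[i]
    end
    t
end

function sum_manual(a, offset, n)   # hand-written offset arithmetic
    t = 0.0
    for i in 1:n
        t += a[offset + i]
    end
    t
end

sum_view(s) == sum_manual(a, 1000, 1000)   # same result; compare with @time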

On Friday, April 17, 2015 at 4:24:14 PM UTC-4, Peter Brady wrote:

 Thanks for the links.  I'll check out ArrayViews as it looks like what I 
 was going to do manually without wrapping it in a type.

 By semi-dim agnostic I meant that the differencing algorithm itself only 
 cares about one dimension but that dimension is different for different 
 directions. Only a few toplevel routines actually need to know about the 
 dimensionality of the problem. 

 On Friday, April 17, 2015 at 2:04:39 PM UTC-6, René Donner wrote:

 As far as I have measured, sub in 0.4 is still not cheap, as it 
 provides the flexibility to deal with all kinds of strides and offsets, and 
 the view object itself thus has a certain size. See 
 https://github.com/rened/FunctionalData.jl#efficiency for a simple 
 analysis, where the speed is mostly dominated by the speed of the 
 sub-view mechanism. 

 To get faster views which require strides etc you can try 
 https://github.com/JuliaLang/ArrayViews.jl 

 What do you mean by semi-dim agnostic? In case you only need indexing 
 along the last dimension (like a[:,:,i] and a[:,:,:,i]) you can use 
   https://github.com/rened/FunctionalData.jl#efficient-views-details 
 which uses normal DenseArrays and simple pointer updates internally. It 
 can also update a view in-place, by just incrementing the pointer. 



 On 17.04.2015 at 21:48, Peter Brady peter...@gmail.com wrote: 

  In order to write some differencing algorithms in a semi-dimension-agnostic 
 manner, the code I've written makes heavy use of subarrays which 
 turn out to be rather costly. I've noticed some posts on the cost of 
 subarrays here and that things will be better in 0.4.  Can someone comment 
 on how much better?  Would subarray (or anything like it) be on par with 
 simply passing an offset and stride (constant) and computing the index 
 myself? I'm currently using the 0.3 release branch. 



[julia-users] Re: Non-GPL Julia?

2015-04-15 Thread Sebastian Good
Is producing a non-GPL Julia build still on the radar? It might be a nice 
goal for the 0.4 release, even if we have to build it ourselves (e.g. 
against MKL, etc.)

On Monday, April 21, 2014 at 5:00:47 PM UTC-4, Steven G. Johnson wrote:



 On Monday, April 21, 2014 4:40:38 PM UTC-4, Tobias Knopp wrote:

 Yes this is awesome work you have done there. Do you plan to implement 
 the real-data FFT, DCT and DST in pure Julia also? Then one could really 
 think about moving FFTW into a package. Hopefully its author is ok with 
 that ;-) 


 I plan to implement real-data FFTs, and move FFTW into a package.

 Pure-Julia DCT and DST are not in my immediate plans (they are a PITA to 
 do right because there are 16 types, of which 8 are common); my feeling is 
 that the need for these is uncommon enough that it's not terrible to have 
 these in a package instead of in Base. (Hadamard transforms and MDCTs 
 are also currently in packages.)



Re: [julia-users] merged arrays?

2015-04-09 Thread Sebastian Good
Thanks! The primary work is either quick previewing of or format translation of 
large numbers of very large files. We want reasonable random access or 
sequential access speed to what may be terabytes of data. Cleverness with block 
buffering, i/o, compression, etc. are all games for another part of the system 
and so I’m trying to just use a terrifically naive mmap strategy. It’s worked 
great with one file, and now I’d like to see if I can export a 2D array 
representing the data in lots of files. A few hours of mmap trickery gets me a 
long way to being able to write boring, understandable code against this data 
while I wrangle it into a more suitable form. 
On April 9, 2015 at 4:38:16 AM, Tim Holy (tim.h...@gmail.com) wrote:

https://github.com/tanmaykm/ChainedVectors.jl  

--Tim  

On Wednesday, April 08, 2015 09:24:03 PM Kevin Squire wrote:  
 AFAIK, there's nothing really like that right now, but what do you plan to  
 do with the data? Most linear algebra code, for example, calls out to  
 BLAS, which requires data to be contiguous (or at least strided) in  
 memory. Other code may or may not have this same restriction.  
  
 It should be relatively easy to write an `VCat` type which wraps a variable  
 sized list of arrays and at least lets you index them as if they were one  
 array. Making it efficient would probably be more challenging, and making  
 it an actual AbstractArray might be tedious, but should be doable.  
  
 Cheers!  
 Kevin  
  
 On Wed, Apr 8, 2015 at 8:56 PM, Sebastian Good   
  
 sebast...@palladiumconsulting.com wrote:  
  I can use sub to pretend a larger array is a smaller, or differently  
  shaped one. Is there functionality to allow me to treat several smaller  
  arrays as a larger one without copying them? In effect, mimicking hcat  
  and friends, but without copying data.  
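
A minimal sketch of the VCat idea Kevin describes (hypothetical type; indexing
only, with no claim of efficiency or a full AbstractArray interface):

immutable VCat{T} <: AbstractVector{T}
    parts::Vector{Vector{T}}
end

Base.length(v::VCat) = sum([length(p) for p in v.parts])
Base.size(v::VCat) = (length(v),)

function Base.getindex(v::VCat, i::Int)
    for p in v.parts
        i <= length(p) && return p[i]
        i -= length(p)
    end
    throw(BoundsError())
end

v = VCat{Int}(Vector{Int}[[1,2], [3,4,5]])
v[4]   # -> 4, read from the second array without any copying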



[julia-users] merged arrays?

2015-04-08 Thread Sebastian Good
I can use sub to pretend a larger array is a smaller, or differently shaped 
one. Is there functionality to allow me to treat several smaller arrays as 
a larger one without copying them? In effect, mimicking hcat and friends, 
but without copying data.


[julia-users] Re: [ANN] JuliaIO and FileIO

2015-04-06 Thread Sebastian Good
Very interesting project. I've worked a little on problems like this in my 
projects and have found that filename extensions aren't always enough to 
distinguish filetypes. It's not uncommon to have to scan the first few 
(hundred) bytes looking for distinctive patterns, magic cookies, etc. to 
determine just what type a file is. Do you anticipate having a registry of 
canonical file types, e.g. related to MIME types if available?
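
For illustration, a hedged sketch of that kind of sniffing (hypothetical
helper, not part of FileIO; the magic numbers are the standard ones for these
formats):

const MAGIC = [
    ([0xff, 0xd8, 0xff],                               :jpg),
    ([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a], :png),
    ([0x25, 0x50, 0x44, 0x46],                         :pdf),  # "%PDF"
]

function sniff(io::IO)
    header = readbytes(io, 8)        # the first few bytes are usually enough
    for (magic, kind) in MAGIC
        length(header) >= length(magic) &&
            header[1:length(magic)] == magic && return kind
    end
    :unknown
end

open(sniff, "photo.jpg")   # -> :jpg, regardless of the file's extension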

On Saturday, April 4, 2015 at 11:41:14 AM UTC-4, Simon Danisch wrote:

 Hi there,

 FileIO has the aim to make it very easy to read any arbitrary file.
 I hastily copied together a proof of concept by taking code from Images.jl.

 JuliaIO is the umbrella group, which takes IO packages with no home. If 
 someone wrote an IO package, but doesn't have time to implement the FileIO 
 interface, giving it to JuliaIO might be a good idea in order to keep the 
 package usable.

 Concept of FileIO is described in the readme:

 Meta package for FileIO. Purpose is to open a file and return the 
 respective Julia object, without doing any research on how to open the file.

 f = file"test.jpg"       # -> File{:jpg}
 read(f)                  # -> Image
 read(file"test.obj")     # -> Mesh
 read(file"test.csv")     # -> DataFrame

 So far only Images are supported and MeshIO is on the horizon.

 It is structured the following way: There are three levels of abstraction, 
 first FileIO, defining the file_str macro etc, then a meta package for a 
 certain class of file, e.g. Images or Meshes. This meta package defines the 
 Julia datatype (e.g. Mesh, Image) and organizes the importer libraries. 
 This is also a good place to define IO library independent tests for 
 different file formats. Then on the last level, there are the low-level 
 importer libraries, which do the actual IO. They're included via Mike Innes' 
 Requires package (https://github.com/one-more-minute/Requires.jl), so 
 that they don't introduce extra load time if not needed. This way, using 
 FileIO without reading/writing anything should have short load times.

 As an implementation example please look at FileIO -> ImageIO -> 
 ImageMagick. This should already work as a proof of concept. Try:

 using FileIO                 # should be very fast, thanks to Mike Innes' Requires package
 read(file"test.jpg")         # takes a little longer as it needs to load the IO library
 read(file"test.jpg")         # should be fast
 read(File("documents", "images", "myimage.jpg"))  # automatic joinpath via File constructor

 Please open issues if things are not clear or if you find flaws in the 
 concept/implementation.

 If you're interested in working on this infrastructure I'll be pleased to 
 add you to the group JuliaIO.


 Best,

 Simon



Re: [julia-users] SubArray memory footprint

2015-03-29 Thread Sebastian Good
Makes sense. I was just thinking naively of the start and next functions
using tuples, as they would know the proper dimensions etc., but I bet this
is what CartesianIndex does and now I should look at it :-)

On Saturday, March 28, 2015, Tim Holy tim.h...@gmail.com wrote:

 Right now getindex(::AbstractArray, ::Tuple) isn't defined (though there's
  a proposal to define it as one of several options to solve a tricky problem,
  see https://github.com/JuliaLang/julia/pull/10525#issuecomment-84597488).

  More importantly, tuples don't have +, -, min, and max defined for them, and
  it's not obvious they should. But all that is supported by CartesianIndex.

 --Tim

 On Saturday, March 28, 2015 07:10:50 AM Sebastian Good wrote:
  Random thought: If tuples can avoid boxing why not return tuples from the
  iterator protocol?
 
   On Friday, March 27, 2015, Matt Bauman mbau...@gmail.com wrote:
   On Friday, March 27, 2015 at 8:21:10 AM UTC-4, Sebastian Good wrote:
   Forgive my ignorance, but what is Cartesian indexing?
  
   There are two ways to iterate over all elements of an array: Linear
   indexing and Cartesian indexing. For example, given a 2x3 matrix,
 linear
   indexing would use just one index from 1:6, whereas Cartesian indexing
   specifies indices for both dimensions: (1,1), (1,2), (2,1), ...
  
  
    If an array isn't stored contiguously in memory for linear indexing,
    converting to the Cartesian indices is very expensive (because it
    requires integer division, which is surprisingly slow). The new
    `eachindex` method in 0.4 returns an iterator to go over all the
    Cartesian indices very quickly.



-- 
*Sebastian Good*
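
A small sketch of the eachindex pattern Matt describes above (0.4-era;
hypothetical function name):

A = rand(1000, 1000)
S = sub(A, 1:999, 1:999)      # not contiguous, so linear indexing is slow

function sum_eachindex(S)
    t = 0.0
    for I in eachindex(S)     # Cartesian iteration, no integer division
        t += S[I]
    end
    t
end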


Re: [julia-users] SubArray memory footprint

2015-03-28 Thread Sebastian Good
Random thought: If tuples can avoid boxing why not return tuples from the
iterator protocol?

On Friday, March 27, 2015, Matt Bauman mbau...@gmail.com wrote:

 On Friday, March 27, 2015 at 8:21:10 AM UTC-4, Sebastian Good wrote:

 Forgive my ignorance, but what is Cartesian indexing?


 There are two ways to iterate over all elements of an array: Linear
 indexing and Cartesian indexing. For example, given a 2x3 matrix, linear
 indexing would use just one index from 1:6, whereas Cartesian indexing
 specifies indices for both dimensions: (1,1), (1,2), (2,1), ...


  If an array isn't stored contiguously in memory for linear indexing,
  converting to the Cartesian indices is very expensive (because it requires
  integer division, which is surprisingly slow). The new `eachindex` method
  in 0.4 returns an iterator to go over all the Cartesian indices very
  quickly.



-- 
*Sebastian Good*


Re: [julia-users] SubArray memory footprint

2015-03-27 Thread Sebastian Good
Forgive my ignorance, but what is Cartesian indexing?

On Friday, March 27, 2015, Tim Holy tim.h...@gmail.com wrote:

 Nice catch! It seems clear that not all algorithms have been updated to use
 fast iteration, and this is one of them. In contrast, compare
 s = sub(a, 1:10_000, 1:10_000)
 @time sum(s, (1,2))
 which you'll see is just as fast for an array.

  It all comes down to the iteration paradigm; it's not possible to make
  linear indexing fast for all array types, so for SubArrays one needs to use
  Cartesian iteration. There are probably more algorithms in base that need
  to be modified. Can you file an issue with any you notice?

 Best,
 --Tim

 On Thursday, March 26, 2015 10:48:31 PM Sebastian Good wrote:
  Sorry, didn’t realize the conversation had wandered off-list. Thanks for
  correcting that.  I’ll check the type-stability and return with some
  benchmarks. I’ve got a julia build as of yesterday showing about a 5x
  slowdown relative to native array access, which surprised me a little bit.
 
  julia> const a = rand(10_000, 10_000);
 
  (after a few repetitions…)
 
  julia> @time mean(a)
  elapsed time: 0.050039199 seconds (184 bytes allocated)
  0.572164856915
 
  julia> @time mean(sub(a, 1:10_000, 1:10_000));
  elapsed time: 2.349825138 seconds (584 bytes allocated)
 
  Here we’re about 5x slower. In my larger case we’re about 50x slower,
 but I
  will check there for type instability.
 
 
 
  On March 26, 2015 at 5:13:21 PM, Tim Holy (tim.h...@gmail.com) wrote:
 
  It's better to keep these conversations on the list (CCed).
 
  I can't tell from what you've shown whether you're talking about creation
  performance or usage performance. For most cases, usage (indexing)
  performance should be nearly at the level of Arrays, if you believe the
  benchmarks
  (https://github.com/JuliaLang/julia/pull/8501#issuecomment-63232821,
  https://github.com/JuliaLang/julia/pull/9329,
  https://github.com/JuliaLang/julia/pull/10507). Are you sure you're
 using a
  recent 0.4 checkout? You don't have a type-stability problem, do you?
 
  --Tim
 
  On Thursday, March 26, 2015 03:44:13 PM you wrote:
   Now that I’ve used a few in anger, they also seem slow in v0.4, like 50x
   slower than regular arrays. Is this down to bounds checking?
  
   My case is simple, I have data that looks like
  
   ..header..|..data[n]..|..header..|..data[n]..|...
  
   So all I have to do to expose the data as a 2D array is
    sub(header_len+1:header_len+n, 1:num_records) I’d think it would
    specialize very fast since there is still contiguous access in the first
    dimension, and a contiguous stride.
  
   And it’s good enough to get working, but I was a bit surprised. Any
   suggestions?
  
   On March 25, 2015 at 4:48:41 PM, Sebastian Good
    (sebast...@palladiumconsulting.com) wrote:
  
   I think I’d have to know a lot more about Julia to have a reasonable
   contribution in those murky waters!
  
    On March 25, 2015 at 4:07:19 PM, Tim Holy (tim.h...@gmail.com) wrote:
  
   On Wednesday, March 25, 2015 03:47:55 PM you wrote:
I guess it’s at that point that getindex
with ranges will return SubArrays, i.e. mutable views, instead of
copies?
Is that still targeted for v0.4?
  
   In my opinion, that depends mostly on whether we settle on a solution
 to
   https://github.com/JuliaLang/julia/pull/8227 in time. Help wanted :-).
  
   --Tim
  
 On March 25, 2015 at 3:30:03 PM, Tim Holy (tim.h...@gmail.com) wrote:
   
SubArrays are immutable on 0.4. But tuples aren't inlined, which is
going
to force allocation.
   
Assuming you're using 0.3, there's a second problem: the code in the
constructor is not type-stable, and that makes construction slow and
memory- hungry. Compare the following on 0.3 and 0.4:
   
julia> A = rand(2,10^4);
   
julia> function myfun(A)
s = 0.0
for j = 1:size(A,2)
S = slice(A, :, j)
s += sum(S)
end
s
end
myfun (generic function with 1 method)
   
   
On 0.3:
# warmup call
julia> @time myfun(A)
elapsed time: 0.145141435 seconds (11277536 bytes allocated)
   
# the real call
julia> @time myfun(A)
elapsed time: 0.034556106 seconds (7866896 bytes allocated)
   
   
On 0.4:
julia> @time myfun(A)
elapsed time: 0.190744146 seconds (7 MB allocated)
   
julia> @time myfun(A)
elapsed time: 0.000697173 seconds (1 MB allocated)
   
   
   
So you can see it's about 50x faster and about 8-fold more memory
efficient
on 0.4. Once Jeff finishes his tuple overhaul, the allocation on 0.4
could
potentially drop to 0.
   
--Tim
   
On Wednesday, March 25, 2015 11:18:08 AM Sebastian Good wrote:
 I was surprised by two things in the SubArray implementation

  1) They are big! About 175 bytes for a simple subset from a 1D array from
  my naive measurement.[*]
  2) They are not flat

Re: [julia-users] SubArray memory footprint

2015-03-27 Thread Sebastian Good
Ah, excellent, that makes sense. In practice this is how we'll be doing it
anyway.

On Friday, March 27, 2015, Matt Bauman mbau...@gmail.com wrote:

 On Friday, March 27, 2015 at 8:21:10 AM UTC-4, Sebastian Good wrote:

 Forgive my ignorance, but what is Cartesian indexing?


 There are two ways to iterate over all elements of an array: Linear
 indexing and Cartesian indexing. For example, given a 2x3 matrix, linear
 indexing would use just one index from 1:6, whereas Cartesian indexing
 specifies indices for both dimensions: (1,1), (1,2), (2,1), ...


  If an array isn't stored contiguously in memory for linear indexing,
  converting to the Cartesian indices is very expensive (because it requires
  integer division, which is surprisingly slow). The new `eachindex` method
  in 0.4 returns an iterator to go over all the Cartesian indices very
  quickly.



-- 
*Sebastian Good*


Re: [julia-users] SubArray memory footprint

2015-03-27 Thread Sebastian Good
Will do!

On Friday, March 27, 2015, Tim Holy tim.h...@gmail.com wrote:

 Thanks for chiming in, Matt.

 I should have added that there are _some_ SubArrays that do have efficient
 linear indexing: sub(A, :, 3:5) and sub(A, 2, :), for a matrix A, are two
 examples. (The latter in particular is one of the advantages of 0.4's
 SubArrays over ArrayViews.) But in general it's not efficient.

 But Sebastian, please do file those issues. It's hard to keep track of what
 needs updating, and issues are vastly preferable to posts to julia-users.
 For
 instance, it's already gone clean out of my head what function was slow
 with
 SubArrays :-).

 --Tim

 On Friday, March 27, 2015 10:58:40 AM you wrote:
  Ah, excellent, that makes sense. In practice this is how we'll be doing
 it
  anyway.
 
   On Friday, March 27, 2015, Matt Bauman mbau...@gmail.com wrote:
   On Friday, March 27, 2015 at 8:21:10 AM UTC-4, Sebastian Good wrote:
   Forgive my ignorance, but what is Cartesian indexing?
  
   There are two ways to iterate over all elements of an array: Linear
   indexing and Cartesian indexing. For example, given a 2x3 matrix,
 linear
   indexing would use just one index from 1:6, whereas Cartesian indexing
   specifies indices for both dimensions: (1,1), (1,2), (2,1), ...
  
  
    If an array isn't stored contiguously in memory for linear indexing,
    converting to the Cartesian indices is very expensive (because it
    requires integer division, which is surprisingly slow). The new
    `eachindex` method in 0.4 returns an iterator to go over all the
    Cartesian indices very quickly.



-- 
*Sebastian Good*


Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-26 Thread Sebastian Good
Thanks for this addition, Jameson. I have found it maddeningly difficult to 
establish just what’s being generated in the larger program. Small benchmarks 
show the load being generated without a function call, and larger ones show 
much more complicated behavior. It certainly doesn’t seem like the two ways of 
doing pointer arithmetic should generate code any differently. Since I’m 
running on v0.4, perhaps it is just limitations in the optimization or type 
inference process that will get cleaned up as major features land in trunk.

Reading from a stream is certainly the best way around this sort of silliness 
when you can manage it, but I’m trying to read from a memory mapped file to 
avoid making copies of data (at least at this stage in the pipeline)

On March 26, 2015 at 12:24:10 AM, Jameson Nash (vtjn...@gmail.com) wrote:

 Given the performance difference and the different behavior, I'm tempted to 
 just deprecate the two-argument form of pointer.

let's try to be aware of the fact that there is no performance difference, 
before we throw out any wild claims about function calls being problematic or 
slow:

julia> g(x) = for i = 1:1e6 pointer(x,12) end
g (generic function with 1 method)

julia> h(x) = for i = 1:1e6 pointer(x)+12*sizeof(x) end
h (generic function with 1 method)

julia> @time g(Int8[])
elapsed time: 0.451235329 seconds (144 bytes allocated)

julia> @time h(Int8[])
elapsed time: 0.450592699 seconds (144 bytes allocated)

 There's a branch in eltype, which is probably causing this difference.

That branch is of the form `if true`, so it will get optimized away. (There is 
still a performance gap relative to calling sizeof, but it stems from a current 
limitation of the julia codegen/inference, and not anything major.)

 To more closely follow the principle of pointer arithmetic long ago 
established by C

C needed to define pointer arithmetic to be equivalent to array access, because 
it decided that `a[x]` was defined to be just syntactic sugar for `*(a+x)`. I 
don't see how that is really a feature, since it throws away perfectly good 
syntax and instead gives you something harder to use. So instead, Julia defines 
math-like operations to generally work like math (so x+1 gives you the pointer 
to the next byte), and array-like operations work like array operations (so 
unsafe_load, pointer, getindex, pointer_to_array, etc. all operate based on 
elements). FWIW though, Wikipedia seems to note that most languages don't 
define pointer arithmetic at all: 
http://en.wikipedia.org/wiki/Pointer_(computer_programming)

For your purposes, I believe you should be able to dispense with pointers 
entirely by reading the data from a file (or IOBuffer) and using StrPack.jl to 
deal with any specific alignment issues you may encounter.

On Wed, Mar 25, 2015 at 9:07 AM Stefan Karpinski ste...@karpinski.org wrote:
Given the performance difference and the different behavior, I'm tempted to 
just deprecate the two-argument form of pointer.

On Wed, Mar 25, 2015 at 12:53 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
I guess what I find most confusing is that there would be a difference, since 
adding 1 to a pointer only adds one byte, not one element size.

julia> p1 = pointer(zeros(UInt64));
Ptr{UInt64} @0x00010b28c360
julia> p1 + 1
Ptr{UInt64} @0x00010b28c361

I would have expected the latter to end in 68; the two-argument pointer 
function gets this “right”. 

julia> a = zeros(UInt64);
julia> pointer(a,1)
Ptr{Int64} @0x00010b9c72e0
julia> pointer(a,2)
Ptr{Int64} @0x00010b9c72e8

I can see arguments multiple ways, but when I’m given a strongly typed pointer 
(Ptr{T}), I would expect it to participate in arithmetic in increments of 
sizeof(T).

On March 25, 2015 at 6:36:37 AM, Stefan Karpinski (ste...@karpinski.org) wrote:

That does seem to be the issue. It's tricky to fix since you can't evaluate 
sizeof(Ptr) unless the condition is true.

On Tue, Mar 24, 2015 at 7:13 PM, Stefan Karpinski ste...@karpinski.org wrote:
There's a branch in eltype, which is probably causing this difference.

On Tue, Mar 24, 2015 at 7:00 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
Yep, that’s done it. The only difference I can see in the code I wrote before 
and this code is that previously I had

convert(Ptr{T}, pointer(raw, byte_number))

whereas here we have

convert(Ptr{T}, pointer(raw) + byte_number - 1)

The former construction seems to emit a call to a Julia-intrinsic function, 
while the latter executes the more expected simple machine loads. Is there a 
subtle difference between the two calls to pointer?

Thanks all for your help!

On March 24, 2015 at 12:19:00 PM, Matt Bauman (mbau...@gmail.com) wrote:

(The key is to ensure that the method gets specialized for different types with 
the parametric `::Type{T}` in the signature instead of `T::DataType`).
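
To spell out that parenthetical, a hedged sketch contrasting the two
signatures (hypothetical function names):

# T is a plain runtime value here: one method is compiled for all types,
# the result type is unknown, and the load cannot inline.
slow_read(T::DataType, data, off) = unsafe_load(convert(Ptr{T}, pointer(data) + off))

# The parametric form specializes per T: each instantiation knows the
# return type at compile time, so the load reduces to a machine load.
fast_read{T}(::Type{T}, data, off) = unsafe_load(convert(Ptr{T}, pointer(data) + off))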

On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski wrote:
This seems like it works fine to me (on both 0.3 and 0.4):

immutable Test
x::Float32
y::Int64
z::Int8
end

Re: [julia-users] SubArray memory footprint

2015-03-26 Thread Sebastian Good
Sorry, didn’t realize the conversation had wandered off-list. Thanks for 
correcting that.  I’ll check the type-stability and return with some 
benchmarks. I’ve got a julia build as of yesterday showing about a 5x slowdown 
relative to native array access, which surprised me a little bit.

julia> const a = rand(10_000, 10_000);

(after a few repetitions…)

julia> @time mean(a)
elapsed time: 0.050039199 seconds (184 bytes allocated)
0.572164856915

julia> @time mean(sub(a, 1:10_000, 1:10_000));
elapsed time: 2.349825138 seconds (584 bytes allocated)

Here we’re about 5x slower. In my larger case we’re about 50x slower, but I 
will check there for type instability.



On March 26, 2015 at 5:13:21 PM, Tim Holy (tim.h...@gmail.com) wrote:

It's better to keep these conversations on the list (CCed).  

I can't tell from what you've shown whether you're talking about creation  
performance or usage performance. For most cases, usage (indexing) performance  
should be nearly at the level of Arrays, if you believe the benchmarks  
(https://github.com/JuliaLang/julia/pull/8501#issuecomment-63232821,  
https://github.com/JuliaLang/julia/pull/9329,  
https://github.com/JuliaLang/julia/pull/10507). Are you sure you're using a  
recent 0.4 checkout? You don't have a type-stability problem, do you?  

--Tim  

On Thursday, March 26, 2015 03:44:13 PM you wrote:  
 Now that I’ve used a few in anger, they also seem slow in v0.4, like 50x  
 slower than regular arrays. Is this down to bounds checking?  
  
 My case is simple, I have data that looks like  
  
 ..header..|..data[n]..|..header..|..data[n]..|...  
  
 So all I have to do to expose the data as a 2D array is  
 sub(header_len+1:header_len+n, 1:num_records) I’d think it would specialize  
 very fast since there is still contiguous access in the first dimension,  
 and a contiguous stride.  
  
 And it’s good enough to get working, but I was a bit surprised. Any  
 suggestions?  
  
 On March 25, 2015 at 4:48:41 PM, Sebastian Good  
 (sebast...@palladiumconsulting.com) wrote:  
  
 I think I’d have to know a lot more about Julia to have a reasonable  
 contribution in those murky waters!  
  
 On March 25, 2015 at 4:07:19 PM, Tim Holy (tim.h...@gmail.com) wrote:  
  
 On Wednesday, March 25, 2015 03:47:55 PM you wrote:  
  I guess it’s at that point that getindex  
  with ranges will return SubArrays, i.e. mutable views, instead of copies?  
  Is that still targeted for v0.4?  
  
 In my opinion, that depends mostly on whether we settle on a solution to  
 https://github.com/JuliaLang/julia/pull/8227 in time. Help wanted :-).  
  
 --Tim  
  
  On March 25, 2015 at 3:30:03 PM, Tim Holy (tim.h...@gmail.com) wrote:  
   
  SubArrays are immutable on 0.4. But tuples aren't inlined, which is going  
  to force allocation.  
   
  Assuming you're using 0.3, there's a second problem: the code in the  
  constructor is not type-stable, and that makes construction slow and  
  memory- hungry. Compare the following on 0.3 and 0.4:  
   
julia> A = rand(2,10^4);
   
julia> function myfun(A)
  s = 0.0  
  for j = 1:size(A,2)  
  S = slice(A, :, j)  
  s += sum(S)  
  end  
  s  
  end  
  myfun (generic function with 1 method)  
   
   
  On 0.3:  
  # warmup call  
julia> @time myfun(A)
  elapsed time: 0.145141435 seconds (11277536 bytes allocated)  
   
  # the real call  
julia> @time myfun(A)
  elapsed time: 0.034556106 seconds (7866896 bytes allocated)  
   
   
  On 0.4:  
julia> @time myfun(A)
  elapsed time: 0.190744146 seconds (7 MB allocated)  
   
julia> @time myfun(A)
  elapsed time: 0.000697173 seconds (1 MB allocated)  
   
   
   
  So you can see it's about 50x faster and about 8-fold more memory  
  efficient  
  on 0.4. Once Jeff finishes his tuple overhaul, the allocation on 0.4 could  
  potentially drop to 0.  
   
  --Tim  
   
  On Wednesday, March 25, 2015 11:18:08 AM Sebastian Good wrote:  
   I was surprised by two things in the SubArray implementation  

    1) They are big! About 175 bytes for a simple subset from a 1D array from
    my naive measurement.[*]
    2) They are not flat. That is, they seem to get heap allocated and have
    indirections in them.

    I'm guessing this is because SubArrays aren't immutable, and tuples aren't
    always inlined into an immutable either, but I am really grasping at
    straws.

    I'm walking through a very large memory mapped structure and generating
    hundreds of thousands of subarrays to look at various windows of it. I was
    hoping that by using views I would reduce memory usage as compared with
    creating copies of those windows. Indeed I am, but by a lot less than I
    thought I would be.

    In other words: SubArrays are surprisingly expensive because they
    necessitate several memory allocations apiece.

    From the work that's gone into SubArrays I'm guessing that isn't meant to
    be. They are so carefully

Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-25 Thread Sebastian Good
The benefit of the semantics of the two argument pointer function is that it 
preserves intuitive pointer arithmetic. As a new (yet happy!) Julia programmer, 
I certainly don’t know what the deprecation implications of changing pointer 
arithmetic are (vast, sadly, I imagine), but their behavior certainly violated 
my “principle of least astonishment” when I found they worked by bytes, not by 
Ts. That is, instead of base/pointer.jl:64 (and friends) looking like

+(x::Ptr, y::Integer) = oftype(x, (UInt(x) + (y % UInt) % UInt))

I would expect them to look like

+{T}(x::Ptr{T}, y::Integer) = oftype(x, (UInt(x) + sizeof(T)*(y % UInt) % UInt))

To more closely follow the principle of pointer arithmetic long ago established 
by C. The type specialization would make these just as fast. For this to work 
with arrays safely, you’d have to guarantee that dense arrays had no padding 
between elements. Since C requires this to be the case, it seems we’re on 
safe ground?
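
For reference, the current byte-based behavior looks like this (a sketch;
addresses are illustrative):

a = zeros(UInt64, 4)
p = pointer(a)          # Ptr{UInt64}
p + 1                   # today: advances one *byte*   (0x...00 -> 0x...01)
p + sizeof(UInt64)      # one element, done by hand    (0x...00 -> 0x...08)
pointer(a, 2)           # two-argument form: one *element*, same as above
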
On March 25, 2015 at 9:07:40 AM, Stefan Karpinski (ste...@karpinski.org) wrote:

Given the performance difference and the different behavior, I'm tempted to 
just deprecate the two-argument form of pointer.

On Wed, Mar 25, 2015 at 12:53 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
I guess what I find most confusing is that there would be a difference, since 
adding 1 to a pointer only adds one byte, not one element size.

julia> p1 = pointer(zeros(UInt64));
Ptr{UInt64} @0x00010b28c360
julia> p1 + 1
Ptr{UInt64} @0x00010b28c361

I would have expected the latter to end in 68; the two-argument pointer 
function gets this “right”. 

julia> a = zeros(UInt64);
julia> pointer(a,1)
Ptr{Int64} @0x00010b9c72e0
julia> pointer(a,2)
Ptr{Int64} @0x00010b9c72e8

I can see arguments multiple ways, but when I’m given a strongly typed pointer 
(Ptr{T}), I would expect it to participate in arithmetic in increments of 
sizeof(T).

On March 25, 2015 at 6:36:37 AM, Stefan Karpinski (ste...@karpinski.org) wrote:

That does seem to be the issue. It's tricky to fix since you can't evaluate 
sizeof(Ptr) unless the condition is true.

On Tue, Mar 24, 2015 at 7:13 PM, Stefan Karpinski ste...@karpinski.org wrote:
There's a branch in eltype, which is probably causing this difference.

On Tue, Mar 24, 2015 at 7:00 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
Yep, that’s done it. The only difference I can see in the code I wrote before 
and this code is that previously I had

convert(Ptr{T}, pointer(raw, byte_number))

whereas here we have

convert(Ptr{T}, pointer(raw) + byte_number - 1)

The former construction seems to emit a call to a Julia-intrinsic function, 
while the latter executes the more expected simple machine loads. Is there a 
subtle difference between the two calls to pointer?

Thanks all for your help!

On March 24, 2015 at 12:19:00 PM, Matt Bauman (mbau...@gmail.com) wrote:

(The key is to ensure that the method gets specialized for different types with 
the parametric `::Type{T}` in the signature instead of `T::DataType`).

On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski wrote:
This seems like it works fine to me (on both 0.3 and 0.4):

immutable Test
x::Float32
y::Int64
z::Int8
end

julia> a = [Test(1,2,3)]
1-element Array{Test,1}:
 Test(1.0f0,2,3)

julia> b = copy(reinterpret(UInt8, a))
24-element Array{UInt8,1}:
 0x00
 0x00
 0x80
 0x3f
 0x03
 0x00
 0x00
 0x00
 0x02
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x03
 0xe0
 0x82
 0x10
 0x01
 0x00
 0x00
 0x00

julia> prim_read{T}(::Type{T}, data::Array{Uint8,1}, offset::Int) = 
unsafe_load(convert(Ptr{T}, pointer(data) + offset))
prim_read (generic function with 1 method)

julia> prim_read(Test, b, 0)
Test(1.0f0,2,3)

julia> @code_native prim_read(Test, b, 0)
.section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 1
push RBP
mov RBP, RSP
Source line: 1
mov RCX, QWORD PTR [RSI + 8]
vmovss XMM0, DWORD PTR [RCX + RDX]
mov RAX, QWORD PTR [RCX + RDX + 8]
mov DL, BYTE PTR [RCX + RDX + 16]
pop RBP
ret


On Tue, Mar 24, 2015 at 5:04 PM, Simon Danisch sdan...@gmail.com wrote:
There is a high chance that I simply don't understand llvmcall well enough, 
though ;)

Am Montag, 23. März 2015 20:20:09 UTC+1 schrieb Sebastian Good:
I'm trying to read some binary formatted data. In C, I would define an 
appropriately padded struct and cast away. Is it possible to do something 
similar in Julia, though for only one value at a time? Philosophically, I'd 
like to approximate the following, for some simple bittypes T (Int32, Float32, 
etc.)

T readT(char* data, size_t offset) { return *(T*)(data + offset); }

The transliteration of this brain-dead approach results in the following, which 
seems to allocate a boxed Pointer object on every invocation. The pointer 
function comes with ample warnings about how it shouldn't be used, and I 
imagine that it's not polite to the garbage collector.


prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number) =
    unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))

Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-25 Thread Sebastian Good
Ah, I see it’s been discussed and even documented. FWIW, documenting this 
behavior in the pointer function would be useful for newbies like myself. I 
agree with Stefan that the two argument pointer function should be deprecated 
as it’s C-like behavior is inconsistent. If Julia pointer arithmetic is byte 
based, that’s a reasonable convention that just needs to be understood, like 
1-based indexing or FORTRAN array layout.

Sprinkling a few sizeof(T) in your code when you’re mucking about with pointers 
anyway is a small price to pay. With C conventions, you’d do just as much 
mucking about with convert(Ptr{UInt8},...).

On March 25, 2015 at 11:05:00 AM, Milan Bouchet-Valat (nalimi...@club.fr) wrote:

Le mercredi 25 mars 2015 à 07:55 -0700, Matt Bauman a écrit :
See https://github.com/JuliaLang/julia/issues/6219#issuecomment-38117402
This looks like a case where, as discussed for string indexing, writing 
something like p + 5bytes could make sense. Then the default behavior could 
follow the more natural C convention, yet you'd never have to write things like 
p + size/sizeof(T) (to quote Jeff's remark on the issue).


Regards

On Wednesday, March 25, 2015 at 9:58:46 AM UTC-4, Sebastian Good wrote:
The benefit of the semantics of the two argument pointer function is that it 
preserves intuitive pointer arithmetic. As a new (yet happy!) Julia programmer, 
I certainly don’t know what the deprecation implications of changing pointer 
arithmetic are (vast, sadly, I imagine), but their behavior certainly violated 
my “principle of least astonishment” when I found they worked by bytes, not by 
Ts. That is, instead of base/pointer.jl:64 (and friends) looking like


+(x::Ptr, y::Integer) = oftype(x, (UInt(x) + (y % UInt) % UInt))


I would expect them to look like


+{T}(x::Ptr{T}, y::Integer) = oftype(x, (UInt(x) + sizeof(T)*(y % UInt) % UInt))


To more closely follow the principle of pointer arithmetic long ago established 
by C. The type specialization would make these just as fast. For this to work 
with arrays safely, you’d have to guarantee that dense arrays had no padding 
between elements. Since C requires this to be the case, it seems we’re on 
safe ground?
On March 25, 2015 at 9:07:40 AM, Stefan Karpinski (ste...@karpinski.org) wrote:


Given the performance difference and the different behavior, I'm tempted to 
just deprecate the two-argument form of pointer.

On Wed, Mar 25, 2015 at 12:53 PM, Sebastian Good 
seba...@palladiumconsulting.com wrote:
I guess what I find most confusing is that there would be a difference, since 
adding 1 to a pointer only adds one byte, not one element size.


julia> p1 = pointer(zeros(UInt64));
Ptr{UInt64} @0x00010b28c360
julia> p1 + 1
Ptr{UInt64} @0x00010b28c361

I would have expected the latter to end in 68; the two-argument pointer 
function gets this “right”. 

julia> a = zeros(UInt64);
julia> pointer(a,1)
Ptr{Int64} @0x00010b9c72e0
julia> pointer(a,2)
Ptr{Int64} @0x00010b9c72e8


I can see arguments multiple ways, but when I’m given a strongly typed pointer 
(Ptr{T}), I would expect it to participate in arithmetic in increments of 
sizeof(T).

On March 25, 2015 at 6:36:37 AM, Stefan Karpinski (ste...@karpinski.org) wrote:

That does seem to be the issue. It's tricky to fix since you can't evaluate 
sizeof(Ptr) unless the condition is true.

On Tue, Mar 24, 2015 at 7:13 PM, Stefan Karpinski ste...@karpinski.org wrote:
There's a branch in eltype, which is probably causing this difference.

On Tue, Mar 24, 2015 at 7:00 PM, Sebastian Good 
seba...@palladiumconsulting.com wrote:
Yep, that’s done it. The only difference I can see in the code I wrote before 
and this code is that previously I had


convert(Ptr{T}, pointer(raw, byte_number))


whereas here we have


convert(Ptr{T}, pointer(raw) + byte_number - 1)

The former construction seems to emit a call to a Julia-intrinsic function, 
while the latter executes the more expected simple machine loads. Is there a 
subtle difference between the two calls to pointer?

Thanks all for your help!

On March 24, 2015 at 12:19:00 PM, Matt Bauman (mba...@gmail.com) wrote:

(The key is to ensure that the method gets specialized for different types with 
the parametric `::Type{T}` in the signature instead of `T::DataType`).

On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski wrote:
This seems like it works fine to me (on both 0.3 and 0.4):


immutable Test
x::Float32
y::Int64
z::Int8
end


julia> a = [Test(1,2,3)]
1-element Array{Test,1}:
 Test(1.0f0,2,3)


julia> b = copy(reinterpret(UInt8, a))
24-element Array{UInt8,1}:
 0x00
 0x00
 0x80
 0x3f
 0x03
 0x00
 0x00
 0x00
 0x02
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x03
 0xe0
 0x82
 0x10
 0x01
 0x00
 0x00
 0x00


julia> prim_read{T}(::Type{T}, data::Array{Uint8,1}, offset::Int) = 
unsafe_load(convert(Ptr{T}, pointer(data) + offset))
prim_read (generic function with 1 method)


julia> prim_read(Test, b, 0)
Test(1.0f0,2,3)


julia> @code_native prim_read(Test, b, 0)
.section

[julia-users] SubArray memory footprint

2015-03-25 Thread Sebastian Good
I was surprised by two things in the SubArray implementation

1) They are big! About 175 bytes for a simple subset from a 1D array from 
my naive measurement.[*]
2) They are not flat. That is, they seem to get heap allocated and have 
indirections in them.

I'm guessing this is because SubArrays aren't immutable, and tuples aren't 
always inlined into an immutable either, but I am really grasping at straws.

I'm walking through a very large memory mapped structure and generating 
hundreds of thousands of subarrays to look at various windows of it. I was 
hoping that by using views I would reduce memory usage as compared with 
creating copies of those windows. Indeed I am, but by a lot less than I 
thought I would be. 

In other words: SubArrays are surprisingly expensive because they 
necessitate several memory allocations apiece.

From the work that's gone into SubArrays I'm guessing that isn't meant to 
be. They are so carefully specialized that I would expect them to behave 
roughly like a (largish) struct in common use.

Is this a misconception? Do I need to take more care about how I 
parameterize the container I put them in to take advantage?

[*]
julia> const b = [1:5;]
julia> function f()
           for i in 1:1_000_000 sub(b, 1:2) end
       end
julia> @time f()
elapsed time: 0.071933306 seconds (175 MB allocated, 9.21% gc time in 8 
pauses with 0 full sweep)


Re: [julia-users] SubArray memory footprint

2015-03-25 Thread Sebastian Good
That helps a bit; I am indeed working on v0.4. A zero-allocation SubArray would 
be a phenomenal achievement. I guess it’s at that point that getindex with 
ranges will return SubArrays, i.e. mutable views, instead of copies? Is that 
still targeted for v0.4?

On March 25, 2015 at 3:30:03 PM, Tim Holy (tim.h...@gmail.com) wrote:

SubArrays are immutable on 0.4. But tuples aren't inlined, which is going to  
force allocation.  

Assuming you're using 0.3, there's a second problem: the code in the  
constructor is not type-stable, and that makes construction slow and memory-  
hungry. Compare the following on 0.3 and 0.4:  

julia> A = rand(2,10^4);

julia> function myfun(A)
s = 0.0  
for j = 1:size(A,2)  
S = slice(A, :, j)  
s += sum(S)  
end  
s  
end  
myfun (generic function with 1 method)  


On 0.3:  
# warmup call  
julia> @time myfun(A)
elapsed time: 0.145141435 seconds (11277536 bytes allocated)  

# the real call  
julia> @time myfun(A)
elapsed time: 0.034556106 seconds (7866896 bytes allocated)  


On 0.4:  
julia> @time myfun(A)
elapsed time: 0.190744146 seconds (7 MB allocated)  

julia> @time myfun(A)
elapsed time: 0.000697173 seconds (1 MB allocated)  



So you can see it's about 50x faster and about 8-fold more memory efficient on  
0.4. Once Jeff finishes his tuple overhaul, the allocation on 0.4 could  
potentially drop to 0.  

--Tim  


On Wednesday, March 25, 2015 11:18:08 AM Sebastian Good wrote:  
 I was surprised by two things in the SubArray implementation  
  
 1) They are big! About 175 bytes for a simple subset from a 1D array from  
 my naive measurement.[*]  
 2) They are not flat. That is, they seem to get heap allocated and have  
 indirections in them.  
  
 I'm guessing this is because SubArrays aren't immutable, and tuples aren't  
 always inlined into an immutable either, but I am really grasping at straws.  
  
 I'm walking through a very large memory mapped structure and generating  
 hundreds of thousands of subarrays to look at various windows of it. I was  
 hoping that by using views I would reduce memory usage as compared with  
 creating copies of those windows. Indeed I am, but by a lot less than I  
 thought I would be.  
  
 In other words: SubArrays are surprisingly expensive because they  
 necessitate several memory allocations apiece.  
  
 From the work that's gone into SubArrays I'm guessing that isn't meant to  
 be. They are so carefully specialized that I would expect them to behave  
 roughly like a (largish) struct in common use.  
  
 Is this a misconception? Do I need to take more care about how I  
 parameterize the container I put them in to take advantage?  
  
 [*]  
  
  const b = [1:5;]  
  function f()  
  
 for i in 1:1_000_000 sub(b, 1:2) end  
 end  
  
  @time f()  
  
 elapsed time: 0.071933306 seconds (175 MB allocated, 9.21% gc time in 8  
 pauses with 0 full sweep)  



Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-25 Thread Sebastian Good
I guess what I find most confusing is that there would be a difference, since 
adding 1 to a pointer only adds one byte, not one element size.

julia> p1 = pointer(zeros(UInt64));
Ptr{UInt64} @0x00010b28c360
julia> p1 + 1
Ptr{UInt64} @0x00010b28c361

I would have expected the latter to end in 68; the two-argument pointer 
function gets this “right”. 

julia> a = zeros(UInt64);
julia> pointer(a,1)
Ptr{Int64} @0x00010b9c72e0
julia> pointer(a,2)
Ptr{Int64} @0x00010b9c72e8

I can see arguments multiple ways, but when I’m given a strongly typed pointer 
(Ptr{T}), I would expect it to participate in arithmetic in increments of 
sizeof(T).

On March 25, 2015 at 6:36:37 AM, Stefan Karpinski (ste...@karpinski.org) wrote:

That does seem to be the issue. It's tricky to fix since you can't evaluate 
sizeof(Ptr) unless the condition is true.

On Tue, Mar 24, 2015 at 7:13 PM, Stefan Karpinski ste...@karpinski.org wrote:
There's a branch in eltype, which is probably causing this difference.

On Tue, Mar 24, 2015 at 7:00 PM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:
Yep, that’s done it. The only difference I can see in the code I wrote before 
and this code is that previously I had

convert(Ptr{T}, pointer(raw, byte_number))

whereas here we have

convert(Ptr{T}, pointer(raw) + byte_number - 1)

The former construction seems to emit a call to a Julia-intrinsic function, 
while the latter executes the more expected simple machine loads. Is there a 
subtle difference between the two calls to pointer?

Thanks all for your help!

On March 24, 2015 at 12:19:00 PM, Matt Bauman (mbau...@gmail.com) wrote:

(The key is to ensure that the method gets specialized for different types with 
the parametric `::Type{T}` in the signature instead of `T::DataType`).

On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski wrote:
This seems like it works fine to me (on both 0.3 and 0.4):

immutable Test
x::Float32
y::Int64
z::Int8
end

julia> a = [Test(1,2,3)]
1-element Array{Test,1}:
 Test(1.0f0,2,3)

julia> b = copy(reinterpret(UInt8, a))
24-element Array{UInt8,1}:
 0x00
 0x00
 0x80
 0x3f
 0x03
 0x00
 0x00
 0x00
 0x02
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x03
 0xe0
 0x82
 0x10
 0x01
 0x00
 0x00
 0x00

julia> prim_read{T}(::Type{T}, data::Array{Uint8,1}, offset::Int) = 
unsafe_load(convert(Ptr{T}, pointer(data) + offset))
prim_read (generic function with 1 method)

julia> prim_read(Test, b, 0)
Test(1.0f0,2,3)

julia> @code_native prim_read(Test, b, 0)
.section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 1
push RBP
mov RBP, RSP
Source line: 1
mov RCX, QWORD PTR [RSI + 8]
vmovss XMM0, DWORD PTR [RCX + RDX]
mov RAX, QWORD PTR [RCX + RDX + 8]
mov DL, BYTE PTR [RCX + RDX + 16]
pop RBP
ret


On Tue, Mar 24, 2015 at 5:04 PM, Simon Danisch sdan...@gmail.com wrote:
There is a high chance that I simply don't understand llvmcall well enough, 
though ;)

On Monday, 23 March 2015 at 20:20:09 UTC+1, Sebastian Good wrote:
I'm trying to read some binary formatted data. In C, I would define an 
appropriately padded struct and cast away. Is it possible to do something 
similar in Julia, though for only one value at a time? Philosophically, I'd 
like to approximate the following, for some simple bittypes T (Int32, Float32, 
etc.)

T readT(char* data, size_t offset) { return *(T*)(data + offset); }

The transliteration of this brain-dead approach results in the following, which 
seems to allocate a boxed Pointer object on every invocation. The pointer 
function comes with ample warnings about how it shouldn't be used, and I 
imagine that it's not polite to the garbage collector.


prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number) =
    unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))

I can reinterpret the whole array, but this will involve a division of the 
offset to calculate the new offset relative to the reinterpreted array, and it 
allocates an array object. 

Is there a better way to simply read the machine word at a particular offset in 
a byte array? I would think it should inline to a single assembly instruction 
if done right.
    





Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-24 Thread Sebastian Good
Thanks Simon,

I’ve tried this approach, but the disassembly still indicates it calls a 
function julia_pointer, which is overhead I’d like to avoid if possible. 
On March 23, 2015 at 6:11:08 PM, Simon Danisch (sdani...@gmail.com) wrote:

unsafe_read_t(T::DataType, x::Vector{Uint8}, byte_offset::Integer) = 
unsafe_load(Ptr{T}(pointer(x, byte_offset)), 1)

[julia-users] When are files un mmap'd?

2015-03-24 Thread Sebastian Good
Given

s = open(bigdata)
a = mmap_array(Int64, (a_billion, a_jillion), s)

Is it safe to use a after s has been closed? The documentation isn't 
entirely clear about when munmap is called; is there an explicit operation 
we should use on a to deterministically unmap it?





Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-24 Thread Sebastian Good
Yep, that’s done it. The only difference I can see in the code I wrote before 
and this code is that previously I had

convert(Ptr{T}, pointer(raw, byte_number))

whereas here we have

convert(Ptr{T}, pointer(raw) + byte_number - 1)

The former construction seems to emit a call to a Julia-intrinsic function, 
while the latter executes the more expected simple machine loads. Is there a 
subtle difference between the two calls to pointer?

Thanks all for your help!

On March 24, 2015 at 12:19:00 PM, Matt Bauman (mbau...@gmail.com) wrote:

(The key is to ensure that the method gets specialized for different types with 
the parametric `::Type{T}` in the signature instead of `T::DataType`).

On Tuesday, March 24, 2015 at 12:10:59 PM UTC-4, Stefan Karpinski wrote:
This seems like it works fine to me (on both 0.3 and 0.4):

immutable Test
x::Float32
y::Int64
z::Int8
end

julia> a = [Test(1,2,3)]
1-element Array{Test,1}:
 Test(1.0f0,2,3)

julia> b = copy(reinterpret(UInt8, a))
24-element Array{UInt8,1}:
 0x00
 0x00
 0x80
 0x3f
 0x03
 0x00
 0x00
 0x00
 0x02
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x00
 0x03
 0xe0
 0x82
 0x10
 0x01
 0x00
 0x00
 0x00

julia> prim_read{T}(::Type{T}, data::Array{Uint8,1}, offset::Int) = 
unsafe_load(convert(Ptr{T}, pointer(data) + offset))
prim_read (generic function with 1 method)

julia> prim_read(Test, b, 0)
Test(1.0f0,2,3)

julia> @code_native prim_read(Test, b, 0)
.section __TEXT,__text,regular,pure_instructions
Filename: none
Source line: 1
push RBP
mov RBP, RSP
Source line: 1
mov RCX, QWORD PTR [RSI + 8]
vmovss XMM0, DWORD PTR [RCX + RDX]
mov RAX, QWORD PTR [RCX + RDX + 8]
mov DL, BYTE PTR [RCX + RDX + 16]
pop RBP
ret


On Tue, Mar 24, 2015 at 5:04 PM, Simon Danisch sdan...@gmail.com wrote:
There is a high chance that I simply don't understand llvmcall well enough, 
though ;)

On Monday, 23 March 2015 at 20:20:09 UTC+1, Sebastian Good wrote:
I'm trying to read some binary formatted data. In C, I would define an 
appropriately padded struct and cast away. Is it possible to do something 
similar in Julia, though for only one value at a time? Philosophically, I'd 
like to approximate the following, for some simple bittypes T (Int32, Float32, 
etc.)

T readT(char* data, size_t offset) { return *(T*)(data + offset); }

The transliteration of this brain-dead approach results in the following, which 
seems to allocate a boxed Pointer object on every invocation. The pointer 
function comes with ample warnings about how it shouldn't be used, and I 
imagine that it's not polite to the garbage collector.


prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number) =
    unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))

I can reinterpret the whole array, but this will involve a division of the 
offset to calculate the new offset relative to the reinterpreted array, and it 
allocates an array object. 

Is there a better way to simply read the machine word at a particular offset in 
a byte array? I would think it should inline to a single assembly instruction 
if done right.
    



[julia-users] zero-allocation reinterpretation of bytes

2015-03-23 Thread Sebastian Good
I'm trying to read some binary formatted data. In C, I would define an 
appropriately padded struct and cast away. Is it possible to do something 
similar in Julia, though for only one value at a time? Philosophically, I'd 
like to approximate the following, for some simple bittypes T (Int32, 
Float32, etc.)

T readT(char* data, size_t offset) { return *(T*)(data + offset); }

The transliteration of this brain-dead approach results in the following, 
which seems to allocate a boxed Pointer object on every invocation. The 
pointer function comes with ample warnings about how it shouldn't be used, 
and I imagine that it's not polite to the garbage collector.

prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number) = 
unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))

I can reinterpret the whole array, but this will involve a division of the 
offset to calculate the new offset relative to the reinterpreted array, and 
it allocates an array object. 

Is there a better way to simply read the machine word at a particular 
offset in a byte array? I would think it should inline to a single assembly 
instruction if done right.
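
For comparison, the whole-array reinterpret alternative mentioned above looks
roughly like this sketch (note the offset division and the extra array object):

raw = zeros(Uint8, 64)                 # stand-in for the mapped byte data
ints = reinterpret(Int64, raw)         # new array object over the same bytes
byte_offset = 16                       # 0-based byte position of the value
ints[div(byte_offset, sizeof(Int64)) + 1]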



Re: [julia-users] Re: zero-allocation reinterpretation of bytes

2015-03-23 Thread Sebastian Good
It’s in memory — or in a memory-mapped file. How do I reinterpret at an offset 
in an array?

On March 23, 2015 at 5:03:47 PM, Tobias Knopp (tobias.kn...@googlemail.com) 
wrote:

Is the data in a file or already read into an array? If it is still in the file 
you can simply read the data using the read function. If the data is read as a 
Uint8 array you can use an immutable and reinterpret into it. This does not 
work, however, if the C-struct would contain fixed-size array data.

Cheers

Tobi



On Monday, 23 March 2015 at 20:20:09 UTC+1, Sebastian Good wrote:
I'm trying to read some binary formatted data. In C, I would define an 
appropriately padded struct and cast away. Is it possible to do something 
similar in Julia, though for only one value at a time? Philosophically, I'd 
like to approximate the following, for some simple bittypes T (Int32, Float32, 
etc.)

T readT(char* data, size_t offset) { return *(T*)(data + offset); }

The transliteration of this brain-dead approach results in the following, which 
seems to allocate a boxed Pointer object on every invocation. The pointer 
function comes with ample warnings about how it shouldn't be used, and I 
imagine that it's not polite to the garbage collector.


prim_read{T}(::Type{T}, data::AbstractArray{Uint8, 1}, byte_number) =
    unsafe_load(convert(Ptr{T}, pointer(data, byte_number)))

I can reinterpret the whole array, but this will involve a division of the 
offset to calculate the new offset relative to the reinterpreted array, and it 
allocates an array object. 

Is there a better way to simply read the machine word at a particular offset in 
a byte array? I would think it should inline to a single assembly instruction 
if done right.
    

[julia-users] Re: Some simple use cases for multi-threading

2015-03-18 Thread Sebastian Good
Task-stealing parallelism is an increasingly common use case and easy to 
program, e.g. Cilk and Grand Central Dispatch. A sketch of the pattern follows.
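
To illustrate, here is a minimal sketch of the divide-and-conquer style that 
task stealing makes cheap. It uses Threads.@spawn from much newer Julia (1.3+), 
so it is purely hypothetical relative to the multi-threading work discussed here:

# Parallel sum: spawned left-half subtasks can be stolen by idle threads.
function psum(xs, lo=1, hi=length(xs))
    hi - lo < 10_000 && return sum(view(xs, lo:hi))  # sequential cutoff
    mid = (lo + hi) >>> 1
    left = Threads.@spawn psum(xs, lo, mid)  # stealable subtask
    right = psum(xs, mid + 1, hi)
    fetch(left) + right
end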

On Thursday, March 12, 2015 at 11:52:37 PM UTC-4, Viral Shah wrote:

 I am looking to put together a set of use cases for our multi-threading 
 capabilities - mainly to push forward as well as a showcase. I am thinking 
 of starting with stuff in the microbenchmarks and the shootout 
 implementations that are already in test/perf. 

 I am looking for other ideas that would be of interest. If there is real 
 interest, we can collect all of these in a repo in JuliaParallel. 

 -viral 





[julia-users] Linking directly with LLVM generated bytecode

2015-03-18 Thread Sebastian Good
I've compiled a separate set of functions with another LLVM-based toolchain 
(in this case Intel's SPMD compiler, ispc). If I have it emit LLVM bitcode 
(i.e. as a .bc file), is it possible to link that LLVM code directly 
into a Julia session? I can compile the library into a dynamic library, 
load it that way, and access it via ccall, but it strikes me that for some 
smaller functions, being able to directly link the LLVM in would enable not 
only a simpler work flow, but maybe even allow for inlining of smaller 
functions or better whole program optimization. I understand Cxx.jl is 
doing tricks vaguely of this nature. Where's the best place to look for 
attempting this sort of magic?
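
For concreteness, the dynamic-library route mentioned above looks roughly like 
this; a sketch assuming a hypothetical libkernels shared library whose ispc 
source exports a C-linkage function `float square(float)`:

# ccall resolves the exported symbol in the shared library at call time.
square(x::Float32) = ccall((:square, "libkernels"), Float32, (Float32,), x)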


Re: [julia-users] Type matching in Julia version 0.4.0-dev+3434

2015-02-20 Thread Sebastian Good
Thanks for the hard work!

I got a surprising deprecation warning when writing an array with one term in 
it. While the expression [1] is valid (creating an Array{Int64,1} with one 
element in it), the expression [1:5] generates a deprecation warning, and 
I now have to write [1:5;]. This seems a little inelegant to me. If 1 and 
1:5 are both expressions, why does the latter require the ';'? (I'm 
guessing my premise that they're both expressions is wrong!)
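
For readers following along, a sketch of the distinction under the new 0.4 rules:

julia> [1:5]      # the range is kept as a single element
1-element Array{UnitRange{Int64},1}:
 1:5

julia> [1:5;]     # the ';' splices (concatenates) the range
5-element Array{Int64,1}:
 1
 2
 3
 4
 5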

On Thursday, February 19, 2015 at 10:11:12 AM UTC-5, Stefan Karpinski wrote:

 The NEWS.md file (https://github.com/JuliaLang/julia/blob/master/NEWS.md) 
 is a good place to look for changes:

 https://github.com/JuliaLang/julia/blob/master/NEWS.md#language-changes

 The new behavior of [ ] is the first item under language changes 
 (https://github.com/JuliaLang/julia/blob/master/NEWS.md#language-changes), 
 with links to the relevant issues and pull requests.

 The type matching change may well be a bug fix since Array{Any,1} and 
 Array{Array,1} are incomparable types (i.e. neither is a subtype of the 
 other, nor are they equivalent). I'd have to see this spelled out a bit 
 more to know what change you're referring to.



 On Thu, Feb 19, 2015 at 5:49 AM, Robert DJ math@gmail.com 
 javascript: wrote:

 Hi,

 I have just updated Julia for the first time in 10 days and now I face 
 problems with old code:

 - The error WARNING: [a] concatenation is deprecated; use [a;] instead. 
 Easy to fix, but what is the reasoning behind adding the ;?

 - Type matching has changed: I have a function that takes arguments of 
 the type `Array{Array{T,N},1}` (output from `typeof`; in words, it is an 
 array where each element is an Array{Any,1} with multiple Array{Float,2}). 
 As type specification in the function, `Array{Any,1}` used to work, but 
 not anymore. 
 Specifying the type as `Array{Array{T,N},1}` with N being an appropriate 
 number doesn't work either. 
 Is there a solution to this?

 Thanks,

 Robert




[julia-users] Re: JuliaGPU, JuliaGeometry, JuliaGL

2015-02-18 Thread Sebastian Good
Forgive my ignorance, what is a Gitter badge?

On Tuesday, February 17, 2015 at 12:12:05 PM UTC-5, Simon Danisch wrote:

 Hi all, 
 I've tried to vitalize some of the Julia groups that I'm part of.
 Groups: JuliaGPU (https://github.com/JuliaGPU), JuliaGeometry 
 (https://github.com/JuliaGeometry), JuliaGL (https://github.com/JuliaGL/)
 The biggest change is that I've added some Gitter badges, to make it easy to 
 start chatting about the main topic in a more relaxed way.
 The results of these informal chats should be gathered in the meta 
 repository, which should be part of the group.
 Also, the meta repository can be used for gathering resources, sketching 
 out milestones, etc...
 Big thanks to everyone who has been active lately, pushing the state of 
 the groups forward!

 If you just want to stay informed and add your two cents to some 
 discussions, please consider joining gitter.
 If you're interested in actively contributing, shoot me a line and I can 
 add you to the group.

 We had a discussion last year about these groups, where we concluded that 
 only mature projects should get into a Julia group.
 As this pretty much resulted in the stagnation of the groups, rendering 
 them useless, I decided to add WIP and even empty projects to just get 
 things going.
 Julia's geometry and OpenGL efforts are at an early stage of 
 development, so the groups are crucial for organizing the work, discussing 
 strategies, and guaranteeing interoperability early on.

 Best,
 Simon



[julia-users] Re: trouble using Cxx.jl

2015-01-29 Thread Sebastian Good
I also ran into issues compiling Cxx on OS X (10.10.2), I believe related 
to llvm-svn:

llvm[8]: Compiling Host.mm for Release+Asserts build
In file included from 
/Users/sebastian/julia/deps/llvm-svn/tools/lldb/source/Host/macosx/Host.mm:68:
In file included from 
/System/Library/Frameworks/Foundation.framework/Headers/Foundation.h:10:
In file included from 
/System/Library/Frameworks/Foundation.framework/Headers/NSArray.h:7:
In file included from 
/System/Library/Frameworks/Foundation.framework/Headers/NSRange.h:5:
/System/Library/Frameworks/Foundation.framework/Headers/NSValue.h:12:1: 
error: 'objc_returns_inner_pointer'
  attribute only applies to methods
@property (readonly) const char *objCType NS_RETURNS_INNER_POINTER;
^ 


This certainly looks like it's grabbing the wrong LLVM. I am building 
from julia 0.4 (master) according to the instructions here 
(https://github.com/Keno/Cxx.jl). Any suggestions on where to start?

On Thursday, November 27, 2014 at 11:24:25 AM UTC-6, Max Suster wrote:

 I am using Cxx.jl on Mac OSX 10.9.5 (64 bit). 
 Cxx requires Julia v0.4.0-dev (i.e., staged functions).  I have rebuilt 
 Cxx many times and the only trouble I encountered was due to not updating 
 Clang in deps/llvm-svn and related directories. 

 I think it is best to file an issue in Cxx.jl and hopefully someone can 
 help you with linux-specific issues. 



[julia-users] Why do array subscripts create copies instead of views?

2015-01-13 Thread Sebastian Good
Forgive what is probably an old discussion, but I am curious why array 
subscripts create copies of arrays instead of views into them. Given

julia> a = [1:3 1:3];

I can mutate a whole column of the array in place by saying

julia> a[:,2] = [4:6];
julia> a
3x2 Array{Int64,2}:
 1  4
 2  5
 3  6

But my sense of referential transparency is violated because I can't 
further mutate it this way

julia> a[:,2][1] = 10;
julia> a
3x2 Array{Int64,2}:
 1  4
 2  5
 3  6

I can achieve mutable sub-views with the sub function

julia> sub(a, :, 2)[1] = 10;
julia> a
3x2 Array{Int64,2}:
 1  10
 2  5
 3  6

Why the difference? Why not make [] syntax sugar for sub? (As () will be 
for call in 0.4)


Re: [julia-users] link error in julia 0.4

2014-12-22 Thread Sebastian Good
Elliot -- thanks for finding the obvious problem! It appears I subjected
myself to some unnecessary tail chasing. I encountered an error
previously, then in mucking around with my gcc install managed to move my
gfortran executable out of my path. Classic own goal. Now that it's back in
place, I did a clean-openblas, waited a few minutes... and I have a
lovely julia prompt. Thanks. Insidious, since even if you bork gcc, Apple's
clang will answer to 'gcc' as well, so the build doesn't fail until later
in a slightly non-obvious way.

*Sebastian Good*


On Mon, Dec 22, 2014 at 1:34 AM, Elliot Saba staticfl...@gmail.com wrote:

 What happens if you run `gfortran` from your shell?

 On Sun, Dec 21, 2014 at 8:50 PM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 Thanks for the hint. Unfortunately, rebuilding blas etc. results in some
 errors; the "No such file or directory" messages leave me worried. (There's no
 evidence there are problems with pthreads or SandyBridge, and in any case
 just trying the rebuild with the suggested options resulted in the same
 errors.) Any ideas what file or directory gfortran might be looking for?

 ...
 setparam_HASWELL.c:677:7: warning: unused variable 'l2'
 [-Wunused-variable]
   int l2 = get_l2_size();
   ^
 1 warning generated.
 make[5]: gfortran: No such file or directory
 make[5]: gfortran: No such file or directory
 make[5]: gfortran: No such file or directory
 make[5]: gfortran: No such file or directory
 make[5]: *** [sgbcon.o] Error 1
 make[5]: *** Waiting for unfinished jobs
 make[5]: *** [sgbbrd.o] Error 1
 make[5]: *** [sgbequ.o] Error 1
 make[5]: *** [sgbrfs.o] Error 1
 make[4]: *** [lapacklib] Error 2
 make[3]: *** [netlib] Error 2
 *** Clean the OpenBLAS build with 'make -C deps clean-openblas'. Rebuild
 with 'make OPENBLAS_USE_THREAD=0' if OpenBLAS had trouble linking
 libpthread.so, and with 'make OPENBLAS_TARGET_ARCH=NEHALEM' if there were
 errors building SandyBridge support. Both these options can also be used
 simultaneously. ***
 make[2]: *** [openblas-v0.2.13/libopenblas.dylib] Error 1
 make[1]: *** [julia-release] Error 2
 make: *** [release] Error 2


 *Sebastian Good*


 On Sun, Dec 21, 2014 at 9:59 PM, Isaiah Norton isaiah.nor...@gmail.com
 wrote:

 Not sure if this applies to your setup, but see:
 https://github.com/JuliaLang/julia/issues/9407#issuecomment-67654090

 On Sun, Dec 21, 2014 at 10:39 PM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 For many months I've enjoyed followed along with the 0.4 development,
 but in the last week the build doesn't seem to be working on OS X Yosemite.
 I get what appear to be a bunch of lapack link errors. Whether compiling
 with gcc or clang the errors seem to be the same; is there some new tooling
 or build steps that need to be in place?

 Making install in .
  ./install-sh -c -d '/Users/sebastian/julia/usr/lib'
  /bin/sh ./libtool   --mode=install /usr/bin/install -c   libarpack.la
 '/Users/sebastian/julia/usr/lib'
 libtool: install: /usr/bin/install -c .libs/libarpack.2.dylib
 /Users/sebastian/julia/usr/lib/libarpack.2.dylib
 libtool: install: (cd /Users/sebastian/julia/usr/lib && { ln -s -f
 libarpack.2.dylib libarpack.dylib || { rm -f libarpack.dylib && ln -s
 libarpack.2.dylib libarpack.dylib; }; })
 libtool: install: /usr/bin/install -c .libs/libarpack.lai
 /Users/sebastian/julia/usr/lib/libarpack.la
 libtool: install: /usr/bin/install -c .libs/libarpack.a
 /Users/sebastian/julia/usr/lib/libarpack.a
 libtool: install: chmod 644 /Users/sebastian/julia/usr/lib/libarpack.a
 libtool: install: ranlib /Users/sebastian/julia/usr/lib/libarpack.a
  ./install-sh -c -d '/Users/sebastian/julia/usr/lib/pkgconfig'
  /usr/bin/install -c -m 644 arpack.pc
 '/Users/sebastian/julia/usr/lib/pkgconfig'
 Making install in TESTS
 Making install in EXAMPLES
 Making install in BAND
 Making install in COMPLEX
 Making install in NONSYM
 Making install in SIMPLE
 Making install in SVD
 Making install in SYM
 Undefined symbols for architecture x86_64:
   _dgemm_, referenced from:
   _cholmod_super_numeric in libcholmod.a(cholmod_super_numeric.o)
   _cholmod_super_lsolve in libcholmod.a(cholmod_super_solve.o)
   _cholmod_super_ltsolve in libcholmod.a(cholmod_super_solve.o)
   _cholmod_l_super_numeric in
 libcholmod.a(cholmod_l_super_numeric.o)
   _cholmod_l_super_lsolve in libcholmod.a(cholmod_l_super_solve.o)
   _cholmod_l_super_ltsolve in libcholmod.a(cholmod_l_super_solve.o)
   _dgemv_, referenced from:
 (etc)







[julia-users] link error in julia 0.4

2014-12-21 Thread Sebastian Good
For many months I've enjoyed followed along with the 0.4 development, but 
in the last week the build doesn't seem to be working on OS X Yosemite. I 
get what appear to be a bunch of lapack link errors. Whether compiling with 
gcc or clang the errors seem to be the same; is there some new tooling or 
build steps that need to be in place?

Making install in .
 ./install-sh -c -d '/Users/sebastian/julia/usr/lib'
 /bin/sh ./libtool   --mode=install /usr/bin/install -c   libarpack.la 
'/Users/sebastian/julia/usr/lib'
libtool: install: /usr/bin/install -c .libs/libarpack.2.dylib 
/Users/sebastian/julia/usr/lib/libarpack.2.dylib
libtool: install: (cd /Users/sebastian/julia/usr/lib && { ln -s -f 
libarpack.2.dylib libarpack.dylib || { rm -f libarpack.dylib && ln -s 
libarpack.2.dylib libarpack.dylib; }; })
libtool: install: /usr/bin/install -c .libs/libarpack.lai 
/Users/sebastian/julia/usr/lib/libarpack.la
libtool: install: /usr/bin/install -c .libs/libarpack.a 
/Users/sebastian/julia/usr/lib/libarpack.a
libtool: install: chmod 644 /Users/sebastian/julia/usr/lib/libarpack.a
libtool: install: ranlib /Users/sebastian/julia/usr/lib/libarpack.a
 ./install-sh -c -d '/Users/sebastian/julia/usr/lib/pkgconfig'
 /usr/bin/install -c -m 644 arpack.pc 
'/Users/sebastian/julia/usr/lib/pkgconfig'
Making install in TESTS
Making install in EXAMPLES
Making install in BAND
Making install in COMPLEX
Making install in NONSYM
Making install in SIMPLE
Making install in SVD
Making install in SYM
Undefined symbols for architecture x86_64:
  _dgemm_, referenced from:
  _cholmod_super_numeric in libcholmod.a(cholmod_super_numeric.o)
  _cholmod_super_lsolve in libcholmod.a(cholmod_super_solve.o)
  _cholmod_super_ltsolve in libcholmod.a(cholmod_super_solve.o)
  _cholmod_l_super_numeric in libcholmod.a(cholmod_l_super_numeric.o)
  _cholmod_l_super_lsolve in libcholmod.a(cholmod_l_super_solve.o)
  _cholmod_l_super_ltsolve in libcholmod.a(cholmod_l_super_solve.o)
  _dgemv_, referenced from:
(etc)


Re: [julia-users] link error in julia 0.4

2014-12-21 Thread Sebastian Good
Thanks for the hint. Unfortunately, rebuilding blas etc. results in some
errors; the "No such file or directory" messages leave me worried. (There's no
evidence there are problems with pthreads or SandyBridge, and in any case
just trying the rebuild with the suggested options resulted in the same
errors.) Any ideas what file or directory gfortran might be looking for?

...
setparam_HASWELL.c:677:7: warning: unused variable 'l2' [-Wunused-variable]
  int l2 = get_l2_size();
  ^
1 warning generated.
make[5]: gfortran: No such file or directory
make[5]: gfortran: No such file or directory
make[5]: gfortran: No such file or directory
make[5]: gfortran: No such file or directory
make[5]: *** [sgbcon.o] Error 1
make[5]: *** Waiting for unfinished jobs
make[5]: *** [sgbbrd.o] Error 1
make[5]: *** [sgbequ.o] Error 1
make[5]: *** [sgbrfs.o] Error 1
make[4]: *** [lapacklib] Error 2
make[3]: *** [netlib] Error 2
*** Clean the OpenBLAS build with 'make -C deps clean-openblas'. Rebuild
with 'make OPENBLAS_USE_THREAD=0' if OpenBLAS had trouble linking
libpthread.so, and with 'make OPENBLAS_TARGET_ARCH=NEHALEM' if there were
errors building SandyBridge support. Both these options can also be used
simultaneously. ***
make[2]: *** [openblas-v0.2.13/libopenblas.dylib] Error 1
make[1]: *** [julia-release] Error 2
make: *** [release] Error 2


*Sebastian Good*


On Sun, Dec 21, 2014 at 9:59 PM, Isaiah Norton isaiah.nor...@gmail.com
wrote:

 Not sure if this applies to your setup, but see:
 https://github.com/JuliaLang/julia/issues/9407#issuecomment-67654090

 On Sun, Dec 21, 2014 at 10:39 PM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 For many months I've enjoyed followed along with the 0.4 development, but
 in the last week the build doesn't seem to be working on OS X Yosemite. I
 get what appear to be a bunch of lapack link errors. Whether compiling with
 gcc or clang the errors seem to be the same; is there some new tooling or
 build steps that need to be in place?

 Making install in .
  ./install-sh -c -d '/Users/sebastian/julia/usr/lib'
  /bin/sh ./libtool   --mode=install /usr/bin/install -c   libarpack.la
 '/Users/sebastian/julia/usr/lib'
 libtool: install: /usr/bin/install -c .libs/libarpack.2.dylib
 /Users/sebastian/julia/usr/lib/libarpack.2.dylib
 libtool: install: (cd /Users/sebastian/julia/usr/lib && { ln -s -f
 libarpack.2.dylib libarpack.dylib || { rm -f libarpack.dylib && ln -s
 libarpack.2.dylib libarpack.dylib; }; })
 libtool: install: /usr/bin/install -c .libs/libarpack.lai
 /Users/sebastian/julia/usr/lib/libarpack.la
 libtool: install: /usr/bin/install -c .libs/libarpack.a
 /Users/sebastian/julia/usr/lib/libarpack.a
 libtool: install: chmod 644 /Users/sebastian/julia/usr/lib/libarpack.a
 libtool: install: ranlib /Users/sebastian/julia/usr/lib/libarpack.a
  ./install-sh -c -d '/Users/sebastian/julia/usr/lib/pkgconfig'
  /usr/bin/install -c -m 644 arpack.pc
 '/Users/sebastian/julia/usr/lib/pkgconfig'
 Making install in TESTS
 Making install in EXAMPLES
 Making install in BAND
 Making install in COMPLEX
 Making install in NONSYM
 Making install in SIMPLE
 Making install in SVD
 Making install in SYM
 Undefined symbols for architecture x86_64:
   _dgemm_, referenced from:
   _cholmod_super_numeric in libcholmod.a(cholmod_super_numeric.o)
   _cholmod_super_lsolve in libcholmod.a(cholmod_super_solve.o)
   _cholmod_super_ltsolve in libcholmod.a(cholmod_super_solve.o)
   _cholmod_l_super_numeric in libcholmod.a(cholmod_l_super_numeric.o)
   _cholmod_l_super_lsolve in libcholmod.a(cholmod_l_super_solve.o)
   _cholmod_l_super_ltsolve in libcholmod.a(cholmod_l_super_solve.o)
   _dgemv_, referenced from:
 (etc)





[julia-users] Typeclass implementation

2014-11-21 Thread Sebastian Good
In implementing new kinds of numbers, I've found it difficult to know just 
how many functions I need to implement for the general library to just 
work on them. Take as an example a byte-swapped, e.g. big-endian, integer. 
This is handy when doing memory-mapped I/O on a file with data written in 
network order. It would be nice to just implement, say, Int32BigEndian and 
have it act like a real number. (Then I could just reinterpret a mmaped 
array and work directly off it) In general, we'd convert to Int32 at the 
earliest opportunity we had. For instance the following macro introduces a 
new type which claims to be derived from $base_type, and implements 
conversions and promotion rules to get it into a native form ($n_type) 
whenever it's used.

macro encoded_bitstype(name, base_type, bits_type, n_type, to_n, from_n)
quote
immutable $name <: $base_type
bits::$bits_type
end

Base.bits(x::$name) = bits(x.bits)
Base.bswap(x::$name) = $name(bswap(x.bits))

Base.convert(::Type{$n_type}, x::$name) = $to_n(x.bits)
Base.convert(::Type{$name}, x::$n_type) = $name($from_n(x))
Base.promote_rule(::Type{$name}, ::Type{$n_type}) = $n_type
Base.promote_rule(::Type{$name}, ::Type{$base_type}) = $n_type
end
end

I can use it like this

@encoded_bitstype(Int32BigEndian, Signed, Int32, Int32, bswap, bswap)

But unfortunately, it doesn't work out of the box because the conversions 
need to be explicit. I noticed that many of the math functions promote 
their arguments to a common type, but the following trick doesn't work, 
presumably because the promotion algorithm doesn't ask to promote types 
that are already identical.

Base.promote_rule(::Type{$name}, ::Type{$name}) = $n_type

It seems like there are a couple of issues this raises, and I know I've 
seen similar questions on this list as people implement new kinds of 
things, e.g. exotic matrices.

1. One possibility would be to allow an implicit promotion, perhaps 
expressed as the self-promotion above. I say I'm an Int32BigEndian, or 
CompressedVector, or what have you, and provide a way to turn me into an 
Int32 or Vector implicitly to take advantage of all the functions already 
written on those types. I'm not sure this is a great option for the 
language since it's been explicitly avoided elsewhere, but I'm curious if 
there have been any discussions in this direction.

2. If instead I want to say this new type acts like an Integer, there's 
no canonical place for me to find out what all the functions are I need to 
implement. Ultimately, these are like Haskell's typeclasses, Ord, Eq, etc. 
By trial and error, we can determine many of them and implement them this 
way

macro as_number(name, n_type)
 quote
global +(x::$name, y::$name) = +(convert($n_type, x), 
convert($n_type, y))
global *(x::$name, y::$name) = *(convert($n_type, x), 
convert($n_type, y))
global -(x::$name, y::$name) = -(convert($n_type, x), 
convert($n_type, y))
global -(x::$name) = -convert($n_type, x)
global /(x::$name, y::$name) = /(convert($n_type, x), 
convert($n_type, y))
global ^(x::$name, y::$name) = ^(convert($n_type, x), 
convert($n_type, y))
global ==(x::$name, y::$name) = (==)(convert($n_type, x), 
convert($n_type, y))
global <(x::$name, y::$name) = (<)(convert($n_type, x), 
convert($n_type, y)) 
Base.flipsign(x::$name, y::$name) = Base.flipsign(convert($n_type, 
x), convert($n_type, y))
end
end

But I don't know if I've found them all, and my guesses may well change as 
implementation details inside the base library change. Gradual typing is 
great, but with such a powerful base library already in place, it would be 
good to have a facility to know which functions are associated with which 
named behaviors.

Since we already have abstract classes in place, e.g. Signed, Number, etc., 
it would be natural to extract a list of functions which operate on them, 
or, even better, allow the type declarer to specify which functions 
*should* operate on that abstract class, typeclass- or interface-style. A 
sketch of the reflection available today follows.
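
One can get partway there today with reflection; a sketch (methodswith is in 
Base, and comes up again later in this thread):

methodswith(Signed)        # methods taking a Signed somewhere in their signature
methodswith(Signed, true)  # additionally, methods inherited from supertypes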

Are there any recommendations in place, or updates to the language planned, 
to address these sorts of topics?







Re: [julia-users] Typeclass implementation

2014-11-21 Thread Sebastian Good
I will look into Traits.jl -- interesting package.

To get traction and some of the great power of composability, the base
library will need to be carefully decomposed into traits, which (as noted
in some of the issue conversations on github) takes you straight to the
great research Haskell is doing in this area.

*Sebastian Good*


On Fri, Nov 21, 2014 at 9:38 AM, John Myles White johnmyleswh...@gmail.com
wrote:

 This sounds a bit like a mix of two problems:

 (1) A lack of interfaces:

  - a) A lack of formal interfaces, which will hopefully be addressed by
 something like Traits.jl at some point. (
 https://github.com/JuliaLang/julia/issues/6975)

  - b) A lack of documentation for informal interfaces, such as the methods
 that AbstractArray objects must implement.

 (2) A lack of delegation when you make wrapper types:
 https://github.com/JuliaLang/julia/pull/3292

 The first has moved forward a bunch thanks to Mauro's work. The second has
 not gotten much further, although Kevin Squire wrote a different delegate
 macro that's noticeably better than the draft I wrote.

  -- John

 On Nov 21, 2014, at 2:31 PM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 In implementing new kinds of numbers, I've found it difficult to know just
 how many functions I need to implement for the general library to just
 work on them. Take as an example a byte-swapped, e.g. big-endian, integer.
 This is handy when doing memory-mapped I/O on a file with data written in
 network order. It would be nice to just implement, say, Int32BigEndian and
 have it act like a real number. (Then I could just reinterpret a mmaped
 array and work directly off it) In general, we'd convert to Int32 at the
 earliest opportunity we had. For instance the following macro introduces a
 new type which claims to be derived from $base_type, and implements
 conversions and promotion rules to get it into a native form ($n_type)
 whenever it's used.

 macro encoded_bitstype(name, base_type, bits_type, n_type, to_n, from_n)
 quote
 immutable $name <: $base_type
 bits::$bits_type
 end

 Base.bits(x::$name) = bits(x.bits)
 Base.bswap(x::$name) = $name(bswap(x.bits))

 Base.convert(::Type{$n_type}, x::$name) = $to_n(x.bits)
 Base.convert(::Type{$name}, x::$n_type) = $name($from_n(x))
 Base.promote_rule(::Type{$name}, ::Type{$n_type}) = $n_type
 Base.promote_rule(::Type{$name}, ::Type{$base_type}) = $n_type
 end
 end

 I can use it like this

 @encoded_bitstype(Int32BigEndian, Signed, Int32, Int32, bswap, bswap)

 But unfortunately, it doesn't work out of the box because the conversions
 need to be explicit. I noticed that many of the math functions promote
 their arguments to a common type, but the following trick doesn't work,
 presumably because the promotion algorithm doesn't ask to promote types
 that are already identical.

 Base.promote_rule(::Type{$name}, ::Type{$name}) = $n_type

 It seems like there are a couple of issues this raises, and I know I've
 seen similar questions on this list as people implement new kinds of
 things, e.g. exotic matrices.

 1. One possibility would be to allow an implicit promotion, perhaps
 expressed as the self-promotion above. I say I'm a Int32BigEndian, or
 CompressedVector, or what have you, and provide a way to turn me into an
 Int32 or Vector implicitly to take advantage of all the functions already
 written on those types. I'm not sure this is a great option for the
 language since it's been explicitly avoided elsewhere. but I'm curious if
 there have been any discussions in this direction

 2. If instead I want to say this new type acts like an Integer, there's
 no canonical place for me to find out what all the functions are I need to
 implement. Ultimately, these are like Haskell's typeclasses, Ord, Eq, etc.
 By trial and error, we can determine many of them and implement them this
 way

 macro as_number(name, n_type)
  quote
 global +(x::$name, y::$name) = +(convert($n_type, x),
 convert($n_type, y))
 global *(x::$name, y::$name) = *(convert($n_type, x),
 convert($n_type, y))
 global -(x::$name, y::$name) = -(convert($n_type, x),
 convert($n_type, y))
 global -(x::$name) = -convert($n_type, x)
 global /(x::$name, y::$name) = /(convert($n_type, x),
 convert($n_type, y))
 global ^(x::$name, y::$name) = ^(convert($n_type, x),
 convert($n_type, y))
 global ==(x::$name, y::$name) = (==)(convert($n_type, x),
 convert($n_type, y))
 global <(x::$name, y::$name) = (<)(convert($n_type, x),
 convert($n_type, y))
 Base.flipsign(x::$name, y::$name) = Base.flipsign(convert($n_type,
 x), convert($n_type, y))
 end
 end

 But I don't know if I've found them all, and my guesses may well change as
 implementation details inside the base library change. Gradual typing is
 great, but with such a powerful base library already in place, it would

Re: [julia-users] Typeclass implementation

2014-11-21 Thread Sebastian Good
I'm not sure I understand the distinction you make. You declare a typeclass
by defining the functions needed to qualify for it, as well as default
implementations. e.g.

class Eq a where
  (==), (/=) :: a -> a -> Bool
  x /= y = not (x == y)

the typeclass 'Eq a' requires implementation of two functions, (==) and
(/=), of type a -> a -> Bool (which would look like (a,a) --> Bool in the
proposed Julia function type syntax). The (/=) function has a default
implementation in terms of the (==) function, though you could define your
own for your own type if it were an instance of this typeclass.
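
The Julia analogue of that default method already lives in Base; a minimal 
sketch of how a new type opts in, assuming 0.3-era syntax:

import Base: ==

immutable Point
    x::Int
    y::Int
end

# Implement the one required function...
==(a::Point, b::Point) = a.x == b.x && a.y == b.y
# ...and Base's generic !=(x, y) = !(x == y) supplies the default.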


*Sebastian Good*


On Fri, Nov 21, 2014 at 2:11 PM, Mauro mauro...@runbox.com wrote:

 Sebastian, in Haskell, is there a way to get all functions which are
 constrained by one or several type classes?  I.e. which functions are
 provided by a type-class?  (as opposed to which functions need to be
 implemented to belong to a type-class)

 On Fri, 2014-11-21 at 16:54, Jiahao Chen jia...@mit.edu wrote:
  If instead I want to say this new type acts like an Integer, there's
 no
  canonical place for me to find out what all the functions are I need to
  implement.
 
  The closest thing we have now is methodswith(Integer)
  and methodswith(Integer, true) (the latter gives also all the methods
 that
  Integer inherits from its supertypes).
 
  Thanks,
 
  Jiahao Chen
  Staff Research Scientist
  MIT Computer Science and Artificial Intelligence Laboratory
 
  On Fri, Nov 21, 2014 at 9:54 AM, Sebastian Good 
  sebast...@palladiumconsulting.com wrote:
 
  I will look into Traits.jl -- interesting package.
 
  To get traction and some of the great power of composability, the base
  library will need to be carefully decomposed into traits, which (as
 noted
  in some of the issue conversations on github) takes you straight to the
  great research Haskell is doing in this area.
 
  *Sebastian Good*
 
 
  On Fri, Nov 21, 2014 at 9:38 AM, John Myles White 
  johnmyleswh...@gmail.com wrote:
 
  This sounds a bit like a mix of two problems:
 
  (1) A lack of interfaces:
 
   - a) A lack of formal interfaces, which will hopefully be addressed by
  something like Traits.jl at some point. (
  https://github.com/JuliaLang/julia/issues/6975)
 
   - b) A lack of documentation for informal interfaces, such as the
  methods that AbstractArray objects must implement.
 
  (2) A lack of delegation when you make wrapper types:
  https://github.com/JuliaLang/julia/pull/3292
 
  The first has moved forward a bunch thanks to Mauro's work. The second
  has not gotten much further, although Kevin Squire wrote a different
  delegate macro that's noticeably better than the draft I wrote.
 
   -- John
 
  On Nov 21, 2014, at 2:31 PM, Sebastian Good 
  sebast...@palladiumconsulting.com wrote:
 
  In implementing new kinds of numbers, I've found it difficult to know
  just how many functions I need to implement for the general library to
  just work on them. Take as an example a byte-swapped, e.g.
 big-endian,
  integer. This is handy when doing memory-mapped I/O on a file with data
  written in network order. It would be nice to just implement, say,
  Int32BigEndian and have it act like a real number. (Then I could just
  reinterpret a mmaped array and work directly off it) In general, we'd
  convert to Int32 at the earliest opportunity we had. For instance the
  following macro introduces a new type which claims to be derived from
  $base_type, and implements conversions and promotion rules to get it
 into a
  native form ($n_type) whenever it's used.
 
  macro encoded_bitstype(name, base_type, bits_type, n_type, to_n,
 from_n)
  quote
  immutable $name <: $base_type
  bits::$bits_type
  end
 
  Base.bits(x::$name) = bits(x.bits)
  Base.bswap(x::$name) = $name(bswap(x.bits))
 
  Base.convert(::Type{$n_type}, x::$name) = $to_n(x.bits)
  Base.convert(::Type{$name}, x::$n_type) = $name($from_n(x))
  Base.promote_rule(::Type{$name}, ::Type{$n_type}) = $n_type
  Base.promote_rule(::Type{$name}, ::Type{$base_type}) = $n_type
  end
  end
 
  I can use it like this
 
  @encoded_bitstype(Int32BigEndian, Signed, Int32, Int32, bswap, bswap)
 
  But unfortunately, it doesn't work out of the box because the
 conversions
  need to be explicit. I noticed that many of the math functions promote
  their arguments to a common type, but the following trick doesn't work,
  presumably because the promotion algorithm doesn't ask to promote types
  that are already identical.
 
  Base.promote_rule(::Type{$name}, ::Type{$name}) = $n_type
 
  It seems like there are a couple of issues this raises, and I know I've
  seen similar questions on this list as people implement new kinds of
  things, e.g. exotic matrices.
 
  1. One possibility would be to allow an implicit promotion, perhaps
  expressed as the self-promotion above. I say I'm a Int32BigEndian

Re: [julia-users] Typeclass implementation

2014-11-21 Thread Sebastian Good
Ah, that I'm not sure of. There is no run-time reflection in Haskell,
though I don't doubt that artifacts of frightful intelligence exist to do
what you ask. Hoogle is a good place to start for that sort of thing.

Though methodswith is of limited use for determining a minimal
implementation. For instance, in my example, I can avoid implementing abs
because I define flipsign. When defining an encoded floating point, tan
magically works, but atan doesn't.

*Sebastian Good*


On Fri, Nov 21, 2014 at 2:47 PM, Mauro mauro...@runbox.com wrote:

 Yep, defining == is needed to implement Eq.  But then is there a way to
 query what functions are constrained by Eq?  For instance, give me a
 list of all functions which Eq provides, i.e. with type: Eq(a) = ...

 This would be similar to methodswith in Julia, although methodswith
 returns both the implementation functions as well as the provided
 functions.  Anyway, I was just wondering.

 On Fri, 2014-11-21 at 20:21, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:
  I'm not sure I understand the distinction you make. You declare a
 typeclass
  by defining the functions needed to qualify for it, as well as default
  implementations. e.g.
 
  class Eq a where
    (==), (/=) :: a -> a -> Bool
    x /= y = not (x == y)
 
  the typeclass 'Eq a' requires implementation of two functions, (==) and
  (/=), of type a -> a -> Bool (which would look like (a,a) --> Bool in the
  proposed Julia function type syntax). The (/=) function has a default
  implementation in terms of the (==) function, though you could define
 your
  own for your own type if it were an instance of this typeclass.
 
 
  *Sebastian Good*
 
 
  On Fri, Nov 21, 2014 at 2:11 PM, Mauro mauro...@runbox.com wrote:
 
  Sebastian, in Haskell, is there a way to get all functions which are
  constrained by one or several type classes?  I.e. which functions are
  provided by a type-class?  (as opposed to which functions need to be
  implemented to belong to a type-class)
 
  On Fri, 2014-11-21 at 16:54, Jiahao Chen jia...@mit.edu wrote:
   If instead I want to say this new type acts like an Integer,
 there's
  no
   canonical place for me to find out what all the functions are I need
 to
   implement.
  
   The closest thing we have now is methodswith(Integer)
   and methodswith(Integer, true) (the latter gives also all the methods
  that
   Integer inherits from its supertypes).
  
   Thanks,
  
   Jiahao Chen
   Staff Research Scientist
   MIT Computer Science and Artificial Intelligence Laboratory
  
   On Fri, Nov 21, 2014 at 9:54 AM, Sebastian Good 
   sebast...@palladiumconsulting.com wrote:
  
   I will look into Traits.jl -- interesting package.
  
   To get traction and some of the great power of composability, the
 base
   library will need to be carefully decomposed into traits, which (as
  noted
   in some of the issue conversations on github) takes you straight to
 the
   great research Haskell is doing in this area.
  
   *Sebastian Good*
  
  
   On Fri, Nov 21, 2014 at 9:38 AM, John Myles White 
   johnmyleswh...@gmail.com wrote:
  
   This sounds a bit like a mix of two problems:
  
   (1) A lack of interfaces:
  
- a) A lack of formal interfaces, which will hopefully be
 addressed by
   something like Traits.jl at some point. (
   https://github.com/JuliaLang/julia/issues/6975)
  
- b) A lack of documentation for informal interfaces, such as the
   methods that AbstractArray objects must implement.
  
   (2) A lack of delegation when you make wrapper types:
   https://github.com/JuliaLang/julia/pull/3292
  
   The first has moved forward a bunch thanks to Mauro's work. The
 second
   has not gotten much further, although Kevin Squire wrote a different
   delegate macro that's noticeably better than the draft I wrote.
  
-- John
  
   On Nov 21, 2014, at 2:31 PM, Sebastian Good 
   sebast...@palladiumconsulting.com wrote:
  
   In implementing new kinds of numbers, I've found it difficult to
 know
   just how many functions I need to implement for the general library
 to
   just work on them. Take as an example a byte-swapped, e.g.
  big-endian,
   integer. This is handy when doing memory-mapped I/O on a file with
 data
   written in network order. It would be nice to just implement, say,
   Int32BigEndian and have it act like a real number. (Then I could
 just
   reinterpret a mmaped array and work directly off it) In general,
 we'd
   convert to Int32 at the earliest opportunity we had. For instance
 the
   following macro introduces a new type which claims to be derived
 from
   $base_type, and implements conversions and promotion rules to get it
  into a
   native form ($n_type) whenever it's used.
  
   macro encoded_bitstype(name, base_type, bits_type, n_type, to_n,
  from_n)
   quote
   immutable $name <: $base_type
   bits::$bits_type
   end
  
   Base.bits(x::$name) = bits(x.bits)
   Base.bswap(x

Re: [julia-users] Re: Dimension Independent Array Access

2014-11-18 Thread Sebastian Good

"the soup has not yet been assembled"

... meaning that array subscripting with ranges will eventually result in 
views on those arrays rather than copies?


Re: [julia-users] Reinterpreting parts of a byte array

2014-11-10 Thread Sebastian Good
Thanks for the responses. As usual, I discover myself making assumptions
that may not have been stated well.

1. I'll be reading small bits (32 bit ints, mostly) at fairly random
addresses and was worried about the overhead of creating array views for
such small objects. Perhaps they are optimized away. I should check :-)
2. I've been taught by other languages that touching raw pointers is
dangerous without also holding some promise that they won't be relocated,
e.g. by a copying collector, etc. I suppose if it's a memory mapped array,
I can roughly cheat and know that the OS won't move it, so Julia can't
either. But it worried me.

*Sebastian Good*


On Sun, Nov 9, 2014 at 11:36 PM, Jameson Nash vtjn...@gmail.com wrote:

 It rather depends upon what you know about the data. If you want a
 file-like abstraction, it may be possible to wrap it in an IOBuffer type
 (if not, it should be parameterized to allow it). If you want an array-like
 abstraction, then I think reinterpreting to different array types may be
 the most direct approach. If the array is coming from C, then you can use
 unsafe_load/unsafe_store directly. As Ivar points out, this is neither more nor
 less dangerous than the same operation in C. Although, if you wrap the data
 buffer in a Julia object (or got it from a Julia call), you can gain some
 element of protection against memory corruption bugs by minimizing the
 amount of julia code that is directly interfacing with the raw memory
 pointer.


 On Sun Nov 09 2014 at 5:42:42 PM Ivar Nesje iva...@gmail.com wrote:

 Is there any problem with reinterpreting the array and then using a
 SubArray or ArrayView to do the index transformation?

 Pointer arithmetic is no more or less dangerous in Julia than it
 is in C. The only thing you need to ensure is that the object you have a
 pointer to is referenced by something the GC traverses, and that it isn't
 moved in memory (e.g. a vector resize).




[julia-users] Reinterpreting parts of a byte array

2014-11-09 Thread Sebastian Good
If I have a large byte array (e.g. from memory mapping) and wish to read a 
bits type from it at an arbitrary offset, is the idiomatic method to cast a 
Ptr{Uint8} at the right offset to the Ptr{T} I need? Or is there 
another method? It seems an overload of reinterpret would be more what I 
expect, e.g. reinterpret(T, array, 7). Unmasking the naked pointer seems 
dangerous to me from a GC perspective.
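
One allocation-light alternative raised in the replies is a file-like wrapper; 
a minimal sketch, assuming data::Vector{Uint8} and a 0-based byte offset:

buf = IOBuffer(data)  # wraps the bytes without copying them
seek(buf, 6)          # 0-based; position just before the 7th byte
x = read(buf, Int32)  # read an Int32 at that offset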


[julia-users] row-wise dot product

2014-11-06 Thread Sebastian Good
Working through the excellent coursera machine-learning course, I found
myself using the row-wise (axis-wise) dot product in Octave, but found
there was no obvious equivalent in Julia.

In Octave/Matlab, one can call dot(a,b,2) to get the row-wise dot product
of two mxn matrices, returned as a new column vector of size mx1.

Even though Julia makes for loops faster, I like sum(dot(a,b,2)) for its
concision over the equivalent array comprehension or explicit for loop.

Hopefully I'm just missing an overload or alternate name?


[julia-users] Re: row-wise dot product

2014-11-06 Thread Sebastian Good
A good solution for this particular problem, though it presumably uses more 
memory than a dedicated axis-aware dot product method. Thanks!

On Thursday, November 6, 2014 10:42:26 AM UTC-5, Douglas Bates wrote:



 On Thursday, November 6, 2014 9:14:10 AM UTC-6, Sebastian Good wrote:

 Working through the excellent coursera machine-learning course, I found 
 myself using the row-wise (axis-wise) dot product in Octave, but found 
 there was no obvious equivalent in Julia. 

 In Octave/Matlab, one can call dot(a,b,2) to get the row-wise dot product 
 of two mxn matrices, returned as a new column vector of size mx1.

 Even though Julia makes for loops faster, I like sum(dot(a,b,2)) for its 
 concision over the equivalent array comprehension or explicit for loop.

 Hopefully I'm just missing an overload or alternate name?

  
  
 julia> a = rand(10,4)
 10x4 Array{Float64,2}:
  0.134279  0.135088   0.33185    0.956108
  0.977812  0.219557   0.887589   0.468597
  0.69524   0.310889   0.449669   0.717189
  0.385896  0.675195   0.0810221  0.179553
  0.717348  0.138556   0.52147    0.458516
  0.821631  0.337048   0.367002   0.320554
  0.531433  0.0298744  0.344748   0.722242
  0.708596  0.550999   0.629017   0.787594
  0.803008  0.380515   0.729874   0.744713
  0.166205  0.5589     0.605327   0.246186

 julia> b = randn(10,4)
 10x4 Array{Float64,2}:
   0.551047   -0.284285   -1.33048    0.0216755
  -1.16133    -0.552537    0.395243  -1.72303
  -0.0181444  -0.481539   -0.985497   0.352999
   1.20222    -0.557973    0.428804  -1.1013
   2.31078     0.0909548   0.329372   0.651853
   0.341906   -0.109811   -0.360118   0.550494
   0.988644    1.02413     0.570208   0.48143
  -1.75465     0.147909   -1.35159    0.89136
  -0.105066   -1.04501    -0.682836   0.600948
   0.556118   -1.24914    -2.45667   -1.02942

 julia> sum(a .* b,2)
 10x1 Array{Float64,2}:
  -0.385205
  -1.71347
  -0.3523
  -0.0758094
   2.14088
   0.288209
   1.10028
  -1.30999
  -0.532863
  -2.34624




Re: [julia-users] Re: row-wise dot product

2014-11-06 Thread Sebastian Good
Agreed! Axis-reducing dot product operators might be a reasonable addition
to the standard library, especially since BLAS provides the (presumably
highly optimized or even multi-threaded) dot product primitive, which a
library function could easily sub out to using the appropriate strides.

*Sebastian Good*


On Thu, Nov 6, 2014 at 1:10 PM, Douglas Bates dmba...@gmail.com wrote:

 On Thursday, November 6, 2014 10:48:26 AM UTC-6, Sebastian Good wrote:

 A good solution for this particular problem, though it presumably uses more
 memory than a dedicated axis-aware dot product method.


 You are correct that the method I showed does create a matrix of the same
 size as `a` and `b` to evaluate the `.*` operation.  You can avoid doing so
 but, of course, the code becomes more opaque.  I think the point for me is
 that if `a` and `b` are so large that the allocation and freeing of the
 memory becomes problematic, I can write the space conserving version in
 Julia and get performance comparable to compiled code.  Lately when
 describing Julia to colleagues I mention the type system and multiple
 dispatch and several other aspects showing how well-designed Julia is.  But
 the point that I emphasize is "one language", which sometimes I extend to
 "One language to rule them all" (I assume everyone is familiar with Lord
 of the Rings).  I can write Julia code in a high-level, vectorized
 style (like R or Matlab/Octave) but I can also, if I need to, write low-level
 iterative code in Julia. I don't need to use a compiled language to write
 interface code.

 If I have very large arrays, perhaps even memory-mapped arrays because
 they are so large, I could define a function

 function rowdot{T}(a::DenseMatrix{T},b::DenseMatrix{T})
 ((m,n) = size(a)) == size(b) || throw(DimensionMismatch())
 res = zeros(T,(m,))
 for j in 1:n, i in 1:m
 res[i] += a[i,j] * b[i,j]
 end
 res
 end

 that avoided creating the temporary.  Once I convinced myself that there
 were no problems in the code (and my first version did indeed have a bug) I
 could change the loop to

 for j in 1:n
     @simd for i in 1:m            # @simd requires a single loop variable
         @inbounds res[i] += a[i,j] * b[i,j]
     end
 end

 and improve the performance.  In the next iteration I could use
 SharedArrays and parallelize the calculation if it really needed it.

 As a programmer I am grateful for the incredible freedom that Julia gives
 me to get as obsessive compulsive about performance as I want.

 Thanks!


 You're welcome.



 On Thursday, November 6, 2014 10:42:26 AM UTC-5, Douglas Bates wrote:



 On Thursday, November 6, 2014 9:14:10 AM UTC-6, Sebastian Good wrote:

 Working through the excellent coursera machine-learning course, I found
 myself using the row-wise (axis-wise) dot product in Octave, but found
 there was no obvious equivalent in Julia.

 In Octave/Matlab, one can call dot(a,b,2) to get the row-wise dot
 product of two mxn matrices, returned as a new column vector of size mx1.

 Even though Julia makes for loops faster, I like sum(dot(a,b,2)) for
 its concision over the equivalent array comprehension or explicit for loop.

 Hopefully I'm just missing an overload or alternate name?



 julia> a = rand(10,4)
 10x4 Array{Float64,2}:
  0.134279  0.135088   0.33185    0.956108
  0.977812  0.219557   0.887589   0.468597
  0.69524   0.310889   0.449669   0.717189
  0.385896  0.675195   0.0810221  0.179553
  0.717348  0.138556   0.52147    0.458516
  0.821631  0.337048   0.367002   0.320554
  0.531433  0.0298744  0.344748   0.722242
  0.708596  0.550999   0.629017   0.787594
  0.803008  0.380515   0.729874   0.744713
  0.166205  0.5589     0.605327   0.246186

 julia> b = randn(10,4)
 10x4 Array{Float64,2}:
   0.551047   -0.284285   -1.33048    0.0216755
  -1.16133    -0.552537    0.395243  -1.72303
  -0.0181444  -0.481539   -0.985497   0.352999
   1.20222    -0.557973    0.428804  -1.1013
   2.31078     0.0909548   0.329372   0.651853
   0.341906   -0.109811   -0.360118   0.550494
   0.988644    1.02413     0.570208   0.48143
  -1.75465     0.147909   -1.35159    0.89136
  -0.105066   -1.04501    -0.682836   0.600948
   0.556118   -1.24914    -2.45667   -1.02942

 julia> sum(a .* b,2)
 10x1 Array{Float64,2}:
  -0.385205
  -1.71347
  -0.3523
  -0.0758094
   2.14088
   0.288209
   1.10028
  -1.30999
  -0.532863
  -2.34624





[julia-users] map performance

2014-10-21 Thread Sebastian Good
I created an array `i` of 500M 32-bit ints, then converted the whole array 
(into a new array) to float32s. The naive call to `float32(i)` ran at one 
speed, but map at quite another!

julia> i = Int32[1:500000000];

julia> @time float32(i);
elapsed time: 1.669573141 seconds (200128 bytes allocated)

julia> @time map(float32, i);
elapsed time: 27.546797783 seconds (1791952 bytes allocated, 25.93% gc 
time)

From looking at the code in AbstractArray.jl it was executing, it would 
appear that it spends a lot of time doing type checking that it doesn't 
need to be doing, and presumably allocating memory as well. I'm not sure if 
this is a general problem that can be fixed with a more specialized method 
in AbstractArray, since we can't prove the map function has the right type 
as far as I understand Julia's type system.

The guts of the map seem to call map_to!, which takes a Callable as its 
first argument.

function map_to!{T}(f::Callable, offs, dest::AbstractArray{T}, A::AbstractArray)

In an ideal world, it would take something more like the following, which 
would allow elimination of the run-time checks

function map_to!{T,U}(f::Func{T,U}, offs, dest::AbstractArray{U}, A::AbstractArray{T})

But this doesn't seem like something the type system is set up to prove. Am 
I on the wrong track? Is this something that is queued for development at a 
later stage?
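
For reference, a minimal sketch of the kind of hand-written loop the follow-up 
message compares against, in 0.3-era syntax:

function to_float32(src::Vector{Int32})
    dest = Array(Float32, length(src))  # preallocate the result
    @simd for i in 1:length(src)
        @inbounds dest[i] = float32(src[i])  # element-wise conversion
    end
    dest
end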



[julia-users] Re: map performance

2014-10-21 Thread Sebastian Good
(FWIW, the performance of the float32(i) call was essentially identical to 
the performance of the hand-coded for loop equivalent in both Julia and 
C++/clang. Stepping out to the Intel SPMD compiler (ispc) to ensure maximum 
vectorization only made it 20% faster. Nicely done!)

On Tuesday, October 21, 2014 10:36:42 PM UTC-4, Sebastian Good wrote:

 I created an array `i` of 500M 32-bit ints, then converted the whole array 
 (into a new array) to float32s. The naive call to `float32(i)` ran at one 
 speed, but map at quite another!

 julia> i = Int32[1:500000000];

 julia> @time float32(i);
 elapsed time: 1.669573141 seconds (200128 bytes allocated)

 julia> @time map(float32, i);
 elapsed time: 27.546797783 seconds (1791952 bytes allocated, 25.93% gc 
 time)

 From looking at the code in AbstractArray.jl it was executing, it would 
 appear that it spends a lot of time doing type checking that it doesn't 
 need to be doing, and presumably allocating memory as well. I'm not sure if 
 this is a general problem that can be fixed with a more specialized method 
 in AbstractArray, since we can't prove the map function has the right type 
 as far as I understand Julia's type system.

 The guts of the map seem to call map_to!, which takes a Callable as its 
 first argument.

 function map_to!{T}(f::Callable, offs, dest::AbstractArray{T}, A::AbstractArray)

 In an ideal world, it would take something more like the following, which 
 would allow elimination of the run-time checks

 function map_to!{T,U}(f::Func{T,U}, offs, dest::AbstractArray{U}, A::AbstractArray{T})

 But this doesn't seem like something the type system is set up to prove. 
 Am I on the wrong track? Is this something that is queued for development 
 at a later stage?



[julia-users] Surprising machine code for bitstype

2014-10-20 Thread Sebastian Good
When I've run benchmarks with custom bitstypes, they seem to run very 
quickly. But I wouldn't have guessed it from the machine code I can preview 
at the REPL. Can anyone explain what I'm seeing here? (I'm dumping 
@code_llvm as it's more instructive; the @code_native is very large). 
Converting a constant number to a Uint8 is a one-operation function. 
Converting a constant number to a bitstype of the same size seems to do a 
lot more.

julia> @code_llvm convert(Uint8, 100)

define i8 @julia_convert;19720(%jl_value_t*, i64) {
top:
  %2 = trunc i64 %1 to i8, !dbg !772, !julia_type !773
  ret i8 %2, !dbg !772
}

julia> bitstype 8 Foo
julia> @code_llvm convert(Foo, 100)

; Function Attrs: noreturn
define void @julia_convert;19725(%jl_value_t*, i64) #0 {
top:
  %2 = alloca [5 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [5 x %jl_value_t*]* %2, i64 0, i64 0
  %3 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 2, !dbg !776
  store %jl_value_t* inttoptr (i64 6 to %jl_value_t*), %jl_value_t** %.sub, 
align 8
  %4 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !776
  %5 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 1, !dbg !776
  %.c = bitcast %jl_value_t** %4 to %jl_value_t*, !dbg !776
  store %jl_value_t* %.c, %jl_value_t** %5, align 8, !dbg !776
  store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg !776
  store %jl_value_t* null, %jl_value_t** %3, align 8
  %6 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 3
  store %jl_value_t* null, %jl_value_t** %6, align 8
  %7 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 4
  store %jl_value_t* null, %jl_value_t** %7, align 8
  %8 = load %jl_value_t** inttoptr (i64 140384830597872 to %jl_value_t**), 
align 16, !dbg !777
  %9 = getelementptr inbounds %jl_value_t* %8, i64 1, i32 0, !dbg !777
  %10 = load %jl_value_t** %9, align 8, !dbg !777, !tbaa %jtbaa_func
  %11 = bitcast %jl_value_t* %10 to %jl_value_t* (%jl_value_t*, 
%jl_value_t**, i32)*, !dbg !777
  store %jl_value_t* %0, %jl_value_t** %3, align 8, !dbg !777
  %12 = call %jl_value_t* @jl_box_int64(i64 %1), !dbg !777
  store %jl_value_t* %12, %jl_value_t** %6, align 8, !dbg !777
  %13 = load %jl_value_t** inttoptr (i64 140384825150528 to %jl_value_t**), 
align 64, !dbg !777
  store %jl_value_t* %13, %jl_value_t** %7, align 8, !dbg !777
  %14 = call %jl_value_t* %11(%jl_value_t* %8, %jl_value_t** %3, i32 3), 
!dbg !777
  %15 = load %jl_value_t** %5, align 8, !dbg !777
  %16 = getelementptr inbounds %jl_value_t* %15, i64 0, i32 0, !dbg !777
  store %jl_value_t** %16, %jl_value_t*** @jl_pgcstack, align 8, !dbg !777
  ret void, !dbg !777
}



Re: [julia-users] Surprising machine code for bitstype

2014-10-20 Thread Sebastian Good
You're right! Sloppy case-minimizing-and-pasting; my apologies. I forgot to
define the conversion. Try this:

julia> import Base.convert

julia> bitstype 8 Foo

julia> convert(::Type{Foo}, x::Uint8) = reinterpret(Foo, x)
convert (generic function with 443 methods)

julia> @code_llvm convert(Uint8, 0x10)

define i8 @julia_convert;19713(%jl_value_t*, i8) {
top:
  ret i8 %1, !dbg !741
}

julia> @code_llvm convert(Foo, 0x10)

define i8 @julia_convert;19716(%jl_value_t*, i8) {
top:
  %2 = alloca [4 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [4 x %jl_value_t*]* %2, i64 0, i64 0
  store %jl_value_t* inttoptr (i64 4 to %jl_value_t*), %jl_value_t** %.sub,
align 8
  %3 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !751
  %4 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 1, !dbg !751
  %.c = bitcast %jl_value_t** %3 to %jl_value_t*, !dbg !751
  store %jl_value_t* %.c, %jl_value_t** %4, align 8, !dbg !751
  store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg !751
  %5 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 2
  store %jl_value_t* null, %jl_value_t** %5, align 8
  %6 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 3
  store %jl_value_t* null, %jl_value_t** %6, align 8
  %7 = load %jl_value_t** %4, align 8, !dbg !752
  %8 = getelementptr inbounds %jl_value_t* %7, i64 0, i32 0, !dbg !752
  store %jl_value_t** %8, %jl_value_t*** @jl_pgcstack, align 8, !dbg !752
  ret i8 %1, !dbg !752
}

julia> convert(Foo, 0x10)
Foo(0x10)

julia> convert(Uint8, 0x10)
0x10


*Sebastian Good*


On Mon, Oct 20, 2014 at 11:29 AM, Stefan Karpinski ste...@karpinski.org
wrote:

 julia> bitstype 8 Foo

 julia> convert(Foo, 100)
 ERROR: `convert` has no method matching convert(::Type{Foo}, ::Int64)
  in convert at base.jl:9

 How fast do you want to raise an error? ;-)

 On Mon, Oct 20, 2014 at 11:18 AM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 When I've run benchmarks with custom bitstypes, they seem to run very
 quickly. But I wouldn't have guessed it from the machine code I can preview
 at the REPL. Can anyone explain what I'm seeing here? (I'm dumping
 @code_llvm as it's more instructive; the @code_native is very large).
 Converting a constant number to a Uint8 is a one-operation function.
 Converting a constant number to a bitstype of the same size seems to do a
 lot more.

 julia> @code_llvm convert(Uint8, 100)

 define i8 @julia_convert;19720(%jl_value_t*, i64) {
 top:
   %2 = trunc i64 %1 to i8, !dbg !772, !julia_type !773
   ret i8 %2, !dbg !772
 }

 julia> bitstype 8 Foo
 julia> @code_llvm convert(Foo, 100)

 ; Function Attrs: noreturn
 define void @julia_convert;19725(%jl_value_t*, i64) #0 {
 top:
   %2 = alloca [5 x %jl_value_t*], align 8
   %.sub = getelementptr inbounds [5 x %jl_value_t*]* %2, i64 0, i64 0
   %3 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 2, !dbg !776
   store %jl_value_t* inttoptr (i64 6 to %jl_value_t*), %jl_value_t**
 %.sub, align 8
   %4 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !776
   %5 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 1, !dbg !776
   %.c = bitcast %jl_value_t** %4 to %jl_value_t*, !dbg !776
   store %jl_value_t* %.c, %jl_value_t** %5, align 8, !dbg !776
   store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg
 !776
   store %jl_value_t* null, %jl_value_t** %3, align 8
   %6 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 3
   store %jl_value_t* null, %jl_value_t** %6, align 8
   %7 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 4
   store %jl_value_t* null, %jl_value_t** %7, align 8
   %8 = load %jl_value_t** inttoptr (i64 140384830597872 to
 %jl_value_t**), align 16, !dbg !777
   %9 = getelementptr inbounds %jl_value_t* %8, i64 1, i32 0, !dbg !777
   %10 = load %jl_value_t** %9, align 8, !dbg !777, !tbaa %jtbaa_func
   %11 = bitcast %jl_value_t* %10 to %jl_value_t* (%jl_value_t*,
 %jl_value_t**, i32)*, !dbg !777
   store %jl_value_t* %0, %jl_value_t** %3, align 8, !dbg !777
   %12 = call %jl_value_t* @jl_box_int64(i64 %1), !dbg !777
   store %jl_value_t* %12, %jl_value_t** %6, align 8, !dbg !777
   %13 = load %jl_value_t** inttoptr (i64 140384825150528 to
 %jl_value_t**), align 64, !dbg !777
   store %jl_value_t* %13, %jl_value_t** %7, align 8, !dbg !777
   %14 = call %jl_value_t* %11(%jl_value_t* %8, %jl_value_t** %3, i32 3),
 !dbg !777
   %15 = load %jl_value_t** %5, align 8, !dbg !777
   %16 = getelementptr inbounds %jl_value_t* %15, i64 0, i32 0, !dbg !777
   store %jl_value_t** %16, %jl_value_t*** @jl_pgcstack, align 8, !dbg !777
   ret void, !dbg !777
 }





Re: [julia-users] Surprising machine code for bitstype

2014-10-20 Thread Sebastian Good
https://github.com/JuliaLang/julia/issues/8742

Let me know if it needs any changes. FWIW I'm on Julia 0.3.1, OS X.
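(For anyone comparing at home: since the convert method above is just a
reinterpret, a quick sanity check is to inspect the reinterpret call
directly, which bypasses the generic convert machinery. With the Foo
bitstype defined earlier, this should come out as a one-instruction body,
like the Uint8 case:

julia> @code_llvm reinterpret(Foo, 0x10)

If the extra gc-frame shows up even there, that narrows the issue down
further.)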


Cheers!

*Sebastian Good*


On Mon, Oct 20, 2014 at 11:43 AM, Jameson Nash vtjn...@gmail.com wrote:

 Can you open an issue? It looks like it's still emitting a gc-frame, even
 though it isn't using it.


 On Monday, October 20, 2014, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 You're right! Sloppy case-minimizing-and-pasting; my apologies. I forgot
 to define the conversion. Try this:

 julia> import Base.convert

 julia> bitstype 8 Foo

 julia> convert(::Type{Foo}, x::Uint8) = reinterpret(Foo, x)
 convert (generic function with 443 methods)

 julia> @code_llvm convert(Uint8, 0x10)

 define i8 @julia_convert;19713(%jl_value_t*, i8) {
 top:
   ret i8 %1, !dbg !741
 }

 julia> @code_llvm convert(Foo, 0x10)

 define i8 @julia_convert;19716(%jl_value_t*, i8) {
 top:
   %2 = alloca [4 x %jl_value_t*], align 8
   %.sub = getelementptr inbounds [4 x %jl_value_t*]* %2, i64 0, i64 0
   store %jl_value_t* inttoptr (i64 4 to %jl_value_t*), %jl_value_t**
 %.sub, align 8
   %3 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !751
   %4 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 1, !dbg !751
   %.c = bitcast %jl_value_t** %3 to %jl_value_t*, !dbg !751
   store %jl_value_t* %.c, %jl_value_t** %4, align 8, !dbg !751
   store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg
 !751
   %5 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 2
   store %jl_value_t* null, %jl_value_t** %5, align 8
   %6 = getelementptr [4 x %jl_value_t*]* %2, i64 0, i64 3
   store %jl_value_t* null, %jl_value_t** %6, align 8
   %7 = load %jl_value_t** %4, align 8, !dbg !752
   %8 = getelementptr inbounds %jl_value_t* %7, i64 0, i32 0, !dbg !752
   store %jl_value_t** %8, %jl_value_t*** @jl_pgcstack, align 8, !dbg !752
   ret i8 %1, !dbg !752
 }

 julia> convert(Foo, 0x10)
 Foo(0x10)

 julia> convert(Uint8, 0x10)
 0x10


 *Sebastian Good*


 On Mon, Oct 20, 2014 at 11:29 AM, Stefan Karpinski ste...@karpinski.org
 wrote:

 julia> bitstype 8 Foo

 julia> convert(Foo, 100)
 ERROR: `convert` has no method matching convert(::Type{Foo}, ::Int64)
  in convert at base.jl:9

 How fast do you want to raise an error? ;-)

 On Mon, Oct 20, 2014 at 11:18 AM, Sebastian Good 
 sebast...@palladiumconsulting.com wrote:

 When I've run benchmarks with custom bitstypes, they seem to run very
 quickly. But I wouldn't have guessed it from the machine code I can preview
 at the REPL. Can anyone explain what I'm seeing here? (I'm dumping
 @code_llvm as it's more instructive; the @code_native is very large).
 Converting a constant number to a Uint8 is a one-operation function.
 Converting a constant number to a bitstype of the same size seems to do a
 lot more.

 julia> @code_llvm convert(Uint8, 100)

 define i8 @julia_convert;19720(%jl_value_t*, i64) {
 top:
   %2 = trunc i64 %1 to i8, !dbg !772, !julia_type !773
   ret i8 %2, !dbg !772
 }

 julia> bitstype 8 Foo
 julia> @code_llvm convert(Foo, 100)

 ; Function Attrs: noreturn
 define void @julia_convert;19725(%jl_value_t*, i64) #0 {
 top:
   %2 = alloca [5 x %jl_value_t*], align 8
   %.sub = getelementptr inbounds [5 x %jl_value_t*]* %2, i64 0, i64 0
   %3 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 2, !dbg !776
   store %jl_value_t* inttoptr (i64 6 to %jl_value_t*), %jl_value_t**
 %.sub, align 8
   %4 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !776
   %5 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 1, !dbg !776
   %.c = bitcast %jl_value_t** %4 to %jl_value_t*, !dbg !776
   store %jl_value_t* %.c, %jl_value_t** %5, align 8, !dbg !776
   store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg
 !776
   store %jl_value_t* null, %jl_value_t** %3, align 8
   %6 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 3
   store %jl_value_t* null, %jl_value_t** %6, align 8
   %7 = getelementptr [5 x %jl_value_t*]* %2, i64 0, i64 4
   store %jl_value_t* null, %jl_value_t** %7, align 8
   %8 = load %jl_value_t** inttoptr (i64 140384830597872 to
 %jl_value_t**), align 16, !dbg !777
   %9 = getelementptr inbounds %jl_value_t* %8, i64 1, i32 0, !dbg !777
   %10 = load %jl_value_t** %9, align 8, !dbg !777, !tbaa %jtbaa_func
   %11 = bitcast %jl_value_t* %10 to %jl_value_t* (%jl_value_t*,
 %jl_value_t**, i32)*, !dbg !777
   store %jl_value_t* %0, %jl_value_t** %3, align 8, !dbg !777
   %12 = call %jl_value_t* @jl_box_int64(i64 %1), !dbg !777
   store %jl_value_t* %12, %jl_value_t** %6, align 8, !dbg !777
   %13 = load %jl_value_t** inttoptr (i64 140384825150528 to
 %jl_value_t**), align 64, !dbg !777
   store %jl_value_t* %13, %jl_value_t** %7, align 8, !dbg !777
   %14 = call %jl_value_t* %11(%jl_value_t* %8, %jl_value_t** %3, i32
 3), !dbg !777
   %15 = load %jl_value_t** %5, align 8, !dbg !777
   %16 = getelementptr inbounds %jl_value_t* %15, i64 0, i32 0, !dbg !777
   store %jl_value_t** %16, %jl_value_t*** @jl_pgcstack, align 8, !dbg
 !777
   ret void

[julia-users] Re: Geometry package

2014-10-01 Thread Sebastian Good
If you're focused on 3D Cartesian space, would some of the existing 
graphics-focused packages do the trick? These might be easily wrapped in 
Julia.
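
If all you need is the simple module described below, it's also only a few
lines to hand-roll. A minimal sketch, assuming plain Float64 Cartesian
coordinates (type and function names are illustrative, not from any
existing package):

immutable Point
    x::Float64
    y::Float64
    z::Float64
end

immutable Direction   # assumed to be kept normalized by the caller
    x::Float64
    y::Float64
    z::Float64
end

immutable Line
    origin::Point
    dir::Direction
end

# point at parameter t along a line
pointat(l::Line, t::Real) = Point(l.origin.x + t*l.dir.x,
                                  l.origin.y + t*l.dir.y,
                                  l.origin.z + t*l.dir.z)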

On Tuesday, September 30, 2014 8:05:51 PM UTC-4, Yakir Gagnon wrote:

 A Geometry package has been discussed here 
 https://groups.google.com/d/topic/julia-dev/vZpZ8NBX_z8/discussion and 
 here https://groups.google.com/d/topic/julia-dev/fqwnyOojRdg/discussion 
 (Spherical, 
 Geographic, Homogeneous coordinate systems in 1 to 4 dimensions and more).

 I'm in need of something significantly simpler: a Geometry module for a 
 Cartesian coordinate system in 3D with Point, Direction, and Line types. 
 Has anyone written something like that? Maybe that ambitious Geometry 
 package is further along or been announced and I missed it?

 Thanks a bunch!



Re: [julia-users] Re: Article on `@simd`

2014-09-24 Thread Sebastian Good
I've been thinking about this a bit, and as usual, Julia's multiple
dispatch might make such a thing possible in a novel way. The heart of ISPC
is allowing a function that looks like

int addScalar (int a, int b) { return a + b; }

effectively be

vector<int> addVector (vector<int> a, vector<int> b) { return /*AVX version of */ a + b; }

This is what vectorizing compilers do, but they don't handle control flow
like ISPC does. Also, ISPC's foreach and foreach_tiled allow these
vectorized functions to be consumed more efficiently, for instance by
handling the ragged/unaligned front and back of arrays with scalar
versions, and the middle bits with vectorized functions.

With support for hardware vectors in Julia, you can start to imagine
writing macros that automatically generate the relevant functions, e.g.
generating addVector from addScalar. However, to do anything cleverer than
the (already extremely clever) LLVM vectorizer, you have to expose masking
operations. To handle incoherent/divergent control flow, you issue vector
operations that are masked, allowing some lanes of the vector to stop
participating in the program for a period.  In a contrived example

int addScalar(int a, int b) { return a % 2 ? a + b : a - b; }

would be turned into something like the below

vector<int> addVector(vector<int> a, vector<int> b) {
  mask = all; // a register with all 1s, indicating all lanes participate
  int mod = a % 2; // vectorized, using mask
  mask = maskwhere(mod != 0);
  vector<int> result = a + b; // vectorized, using mask
  mask = invert(mask);
  result = a - b; // vectorized, using mask
  return result;
}
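
In current Julia you can already get part of this effect by hand: rewrite
the divergent branch as a branchless select so the loop body is
straight-line and @simd has a chance. A rough sketch of the same contrived
example, for integer vectors (addvec! is an illustrative name, not a
library function):

function addvec!(result, a, b)
    @inbounds @simd for i = 1:length(a)
        # ifelse evaluates both arms and keeps one per element,
        # playing the role of the mask above
        result[i] = ifelse(a[i] % 2 != 0, a[i] + b[i], a[i] - b[i])
    end
    return result
end

What this cannot do is handle loops or early exits inside the branch, which
is where the real masking machinery comes in.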

If you look at it closely, you've got versions generated for each function
that are
- scalar
- vector-enabled, but for arbitrary length vectors
- specialized for (one or more hardware) vector sizes
- specialized by alignment (as vector sizes get bigger, e.g. the 32- and
64-byte AVX versions coming out, you can't just rely on the runtime to
align everything properly, it will be too wasteful)

So, I think it's a big ask, but I think it could be produced incrementally.
We'd need help from the Julia language/standard library itself to expose
masked vector operations.


*Sebastian Good*


On Tue, Sep 23, 2014 at 2:52 PM, Jeff Waller truth...@gmail.com wrote:

 Could this theoretical thing be approached incrementally?  Meaning here's
 a project and here are some intermediate results and now it's 1.5x faster, and
 now here's something better and it's 2.7x, all the while the goal is apparent
 but difficult.

 Or would it kind of be all works or doesn't?



Re: [julia-users] Re: Article on `@simd`

2014-09-24 Thread Sebastian Good
... though I suspect to really profit from masked vectorization like this,
it needs to be tackled at a much lower level in the compiler, likely even
as an LLVM optimization pass, guided only by some hints from Julia itself.

*Sebastian Good*


On Wed, Sep 24, 2014 at 10:16 AM, Sebastian Good 
sebast...@palladiumconsulting.com wrote:

 I've been thinking about this a bit, and as usual, Julia's multiple
 dispatch might make such a thing possible in a novel way. The heart of ISPC
 is allowing a function that looks like

 int addScalar (int a, int b) { return a + b; }

 effectively be

 vector<int> addVector (vector<int> a, vector<int> b) { return /*AVX version of */ a + b; }

 This is what vectorizing compilers do, but they don't handle control flow
 like ISPC does. Also, ISPC's foreach and foreach_tiled allow these
 vectorized functions to be consumed more efficiently, for instance by
 handling the ragged/unaligned front and back of arrays with scalar
 versions, and the middle bits with vectorized functions.

 With support for hardware vectors in Julia, you can start to imagine
 writing macros that automatically generate the relevant functions, e.g.
 generating addVector from addScalar. However, to do anything cleverer than
 the (already extremely clever) LLVM vectorizer, you have to expose masking
 operations. To handle incoherent/divergent control flow, you issue vector
 operations that are masked, allowing some lanes of the vector to stop
 participating in the program for a period.  In a contrived example

 int addScalar(int a, int b) { return a % 2 ? a + b : a - b; }

 would be turned into something like the below

 vector<int> addVector(vector<int> a, vector<int> b) {
   mask = all; // a register with all 1s, indicating all lanes participate
   int mod = a % 2; // vectorized, using mask
   mask = maskwhere(mod != 0);
   vector<int> result = a + b; // vectorized, using mask
   mask = invert(mask);
   result = a - b; // vectorized, using mask
   return result;
 }

 If you look at it closely, you've got versions generated for each function
 that are
 - scalar
 - vector-enabled, but for arbitrary length vectors
 - specialized for (one or more hardware) vector sizes
 - specialized by alignment (as vector sizes get bigger, e.g. the 32- and
 64-byte AVX versions coming out, you can't just rely on the runtime to
 align everything properly, it will be too wasteful)

 So, I think it's a big ask, but I think it could be produced
 incrementally. We'd need help from the Julia language/standard library
 itself to expose masked vector operations.


 *Sebastian Good*


 On Tue, Sep 23, 2014 at 2:52 PM, Jeff Waller truth...@gmail.com wrote:

 Could this theoretical thing be approached incrementally?  Meaning here's
 a project and here are some intermediate results and now it's 1.5x faster, and
 now here's something better and it's 2.7x, all the while the goal is apparent
 but difficult.

 Or would it kind of be all works or doesn't?





Re: [julia-users] Re: Article on `@simd`

2014-09-24 Thread Sebastian Good
This is an important point. One of the most important pieces of
functionality in vectorizing compilers is explaining how and why they did
or didn't vectorize your code. It can be terrifically complicated to figure
out. With ISPC, the language is constrained such that everything can be
vectorized and so it's much easier to figure out. (Figuring out whether it
was a good idea or not is left to the programmer!)

*Sebastian Good*


On Wed, Sep 24, 2014 at 12:52 PM, Jake Bolewski jakebolew...@gmail.com
wrote:

 You couldn't really preserve the semantics as Julia is a much more dynamic
 language.  ISPC can do what it does because the kernel language is fairly
 restrictive.

 On Wednesday, September 24, 2014 11:30:56 AM UTC-4, Sebastian Good wrote:

 ... though I suspect to really profit from masked vectorization like
 this, it needs to be tackled at a much lower level in the compiler, likely
 even as an LLVM optimization pass, guided only by some hints from Julia
 itself.

 *Sebastian Good*


 On Wed, Sep 24, 2014 at 10:16 AM, Sebastian Good seba...@
 palladiumconsulting.com wrote:

 I've been thinking about this a bit, and as usual, Julia's multiple
 dispatch might make such a thing possible in a novel way. The heart of ISPC
 is allowing a function that looks like

 int addScalar (int a, int b) { return a + b; }

 effectively be

  vector<int> addVector (vector<int> a, vector<int> b) { return /*AVX version of */ a + b; }

 This is what vectorizing compilers do, but they don't handle control
  flow like ISPC does. Also, ISPC's foreach and foreach_tiled allow these
 vectorized functions to be consumed more efficiently, for instance by
 handling the ragged/unaligned front and back of arrays with scalar
 versions, and the middle bits with vectorized functions.

 With support for hardware vectors in Julia, you can start to imagine
 writing macros that automatically generate the relevant functions, e.g.
  generating addVector from addScalar. However, to do anything cleverer than
 the (already extremely clever) LLVM vectorizer, you have to expose masking
 operations. To handle incoherent/divergent control flow, you issue vector
 operations that are masked, allowing some lanes of the vector to stop
 participating in the program for a period.  In a contrived example

 int addScalar(int a, int b) { return a % 2 ? a + b : a - b; }

 would be turned into something like the below

  vector<int> addVector(vector<int> a, vector<int> b) {
   mask = all; // a register with all 1s, indicating all lanes participate
   int mod = a % 2; // vectorized, using mask
   mask = maskwhere(mod != 0);
    vector<int> result = a + b; // vectorized, using mask
   mask = invert(mask);
   result = a - b; // vectorized, using mask
   return result;
 }

 If you look at it closely, you've got versions generated for each
 function that are
 - scalar
 - vector-enabled, but for arbitrary length vectors
 - specialized for (one or more hardware) vector sizes
 - specialized by alignment (as vector sizes get bigger, e.g. the 32- and
 64-byte AVX versions coming out, you can't just rely on the runtime to
 align everything properly, it will be too wasteful)

 So, I think it's a big ask, but I think it could be produced
 incrementally. We'd need help from the Julia language/standard library
 itself to expose masked vector operations.


 *Sebastian Good*


 On Tue, Sep 23, 2014 at 2:52 PM, Jeff Waller trut...@gmail.com wrote:

 Could this theoretical thing be approached incrementally?  Meaning
 here's a project and here are some intermediate results and now it's 1.5x
 faster, and now here's something better and it's 2.7x, all the while the goal
 is apparent but difficult.

 Or would it kind of be all works or doesn't?






Re: [julia-users] Re: Article on `@simd`

2014-09-23 Thread Sebastian Good
Based on this thread, I spent a few days playing around with a toy 
algorithm in Julia and C++ (trying OpenCL & OpenMP), and finally ISPC.

My verdict? ISPC is nothing short of magical. While my code was easily 
parallelizable (working independently on each element of a large array), it 
was not readily vectorizable by the usual suspects (LLVM or g++) due to 
potential branch divergence. The inner loop contains several if statements 
and even a while loop. In practice, these branches are almost never taken, 
but their presence seems to discourage vectorizers enough that they don't 
attempt a transformation.

On my machine, a reasonably optimized C++ build (gcc 5.0) runs through a 440MB 
computation in about 380ms. (Julia's not far behind.) Taking the inner loop 
functions and compiling them in ispc was almost entirely a copy/paste 
exercise. Some thought was required but far less than other approaches. My 
8-wide AVX-enabled Intel CPU now runs the same benchmark in 140ms, or 2.7 
times faster. I'm not a vector wizard, so perhaps it's possible to get much 
closer to the theoretical 8x speedup, but for minimal effort, unlocking the 
2.7x left otherwise idle in the processor seems like a tremendous thing.

Implementing something ispc-like as Julia macros would not be simple. It's 
sufficiently different from scalar code to require a different type 
system, one differentiating between values which are inherently vectors 
(varying) and those which remain scalar (uniform). It has some new 
constructs (foreach, foreach_tiled, etc.). If you want to take explicit 
advantage of the vectorized code, there is a large family of 
functions giving access to typical vector instructions (shuffle, 
rotate, scatter, etc.)

I think if you want scalar code to be automatically vectorized, then you 
just have to wait for the state of the art in LLVM to improve. But if you're 
willing to make what are often minor changes to your loop, I suspect 
Julia could help with a properly designed macro that applied ISPC-like 
transformations. It would be extremely powerful, but also expensive to 
build. This hypothetical, cleverer @simd macro would be a very large 
compiler unto itself.
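
For contrast, this is the kind of straight-line loop the existing @simd
macro already vectorizes well, with no divergent control flow in the body
(inner is just an illustrative name):

function inner(x, y)
    s = zero(eltype(x))
    @inbounds @simd for i = 1:length(x)
        s += x[i] * y[i]
    end
    return s
end

Everything beyond that shape (branches, while loops, lane shuffles) is what
the hypothetical macro would have to take on.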



On Wednesday, September 17, 2014 8:58:11 PM UTC-4, Erik Schnetter wrote:

 On Wed, Sep 17, 2014 at 7:14 PM, gael@gmail.com wrote: 
  Slightly OT, but since I won't talk about it myself I don't feel this 
 will harm the current thread ... 
  
  
  I don't know if it can be of any help/use/interest for any of you but 
 some people (some at Intel) are actively working on SIMD use with LLVM: 
  
  https://ispc.github.io/index.html 
  
  But I really don't have the skills to tell you if they just wrote a 
 new C-like language that is autovectorizing well or if they do some even 
 smarter stuff to get maximum performance. 

 I think they are up to something clever. 

 If I read things correctly: ispc adds new keywords that describe the 
 memory layout (!) of data structures that are accessed via SIMD 
 instructions. There exist a few commonly-used data layout 
 optimizations that are generally necessary to achieve good performance 
 with SIMD code, called "SOA" or "replicated" or similar. Apparently, 
 ispc introduces respective keywords that automatically transform the 
 layout of those data structures. 

 I wonder whether something equivalent could be implemented via macros 
 in Julia. These would be macros acting on type declarations, not on 
 code. Presumably, these would be array- or structure-like data types, 
 and accessing them is then slightly more complex, so that one would 
 also need to automatically define respective iterators. Maybe there 
 could be a companion macro that acts on loops, so that the loops are 
 transformed (and simd'ized) the same way as the data types... 

 -erik 

 -- 
 Erik Schnetter schn...@cct.lsu.edu 
 http://www.perimeterinstitute.ca/personal/eschnetter/
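
To make the SOA idea concrete, the transformation such a macro would
perform can be written out by hand today. A sketch, with illustrative type
names and no macro involved:

# Array-of-structs: the natural declaration, but each field ends up
# strided in memory.
immutable Particle
    x::Float64
    v::Float64
end
particles_aos = [Particle(rand(), rand()) for i = 1:1000]

# Struct-of-arrays: the layout an SOA-transforming macro would generate.
# Each field is contiguous, which is what SIMD loads want.
type ParticlesSOA
    x::Vector{Float64}
    v::Vector{Float64}
end
particles_soa = ParticlesSOA(rand(1000), rand(1000))

# With the SOA layout the update loop vectorizes cleanly:
function step!(p::ParticlesSOA, dt::Float64)
    @inbounds @simd for i = 1:length(p.x)
        p.x[i] += dt * p.v[i]
    end
    return p
end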