Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Jameson Nash
This Stack Overflow question indicates that there are two options
(http://stackoverflow.com/questions/153559/what-are-some-good-profilers-for-native-c-on-windows):

https://software.intel.com/sites/default/files/managed/cd/92/Intel-VTune-AmplifierXE-2015-Product-Brief-072914.pdf
($900)
http://www.glowcode.com/summary.htm ($500)


On Wed Dec 03 2014 at 9:11:28 AM Stefan Karpinski ste...@karpinski.org wrote:

> This seems nuts. There have to be good profilers on Windows – how do those
> work?
>
> On Tue, Dec 2, 2014 at 11:55 PM, Jameson Nash vtjn...@gmail.com wrote:
>
> > (I forgot to mention, that, to be fair, the windows machine that was used
> > to run this test was an underpowered dual-core hyperthreaded atom
> > processor, whereas the linux and mac machines were pretty comparable Xeon
> > and sandybridge machines, respectively. I only gave windows a factor of 2
> > advantage in the above computation in my accounting for this gap)
> >
> > On Tue Dec 02 2014 at 10:50:20 PM Tim Holy tim.h...@gmail.com wrote:
> >
> > > Wow, those are pathetically-slow backtraces. Since most of us don't have
> > > machines with 500 cores, I don't see anything we can do.
> > >
> > > --Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Tim Holy
Some potentially-interesting links (of which I understand very little):
http://stackoverflow.com/questions/860602/recommended-open-source-profilers#comment2363112_1137133
http://stackoverflow.com/questions/8406175/optimizing-stack-walking-performance

I can tell from this comment:
https://github.com/JuliaLang/julia/issues/2597#issuecomment-15159868
that you already know about this (and its negatives):
http://www.lenholgate.com/blog/2008/09/alternative-call-stack-capturing.html

--Tim


Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Tim Holy
That's a pretty serious bummer. I can't believe anybody puts up with this.

Should we change the default initialization
https://github.com/JuliaLang/julia/blob/b1c99af9bdeef22e0999b28388597757541e2cc7/base/profile.jl#L44
so that, on Windows, it fires every 100ms or so? And add a note to the Profile 
docs?
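
A minimal sketch of what this means for a user at the REPL today, assuming the
0.3-era positional form `Profile.init(nsamples, delay)` (whose defaults the
linked profile.jl line sets) and that the no-argument call reports the current
`(nsamples, delay)` settings, as Daniel's output later in the thread suggests:

```
# Hedged sketch: sample every 100 ms instead of every 1 ms on Windows.
Profile.init(10^7, 0.1)   # enlarge the sample buffer and set a 100 ms delay
Profile.init()            # query the current (nsamples, delay) settings
```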

--Tim

On Wednesday, December 03, 2014 03:37:59 PM Jameson Nash wrote:
> The suggestion apparently is to use Event Tracing for Windows (aka
> ptrace/dtrace). Yes, that is faster (on linux too...), but misses the point
> of profiling user code entirely.
>
> The other offerings typically wrap StackWalk64 (as we do), and complain
> about how absurdly slow it is.
>
> We used to use RtlCaptureStackBackTrace, but it often failed to give useful
> backtraces. I think it depends too heavily on walking the EBP pointer chain
> (which typically doesn't exist on x86_64). As it happens, the remaining
> suggestions fall into the category of "well, obviously you should just
> write your own EBP (32-bit base pointer register) pointer-chain walk
> algorithm; here, I'll even write part of it for you ...", which would be
> very helpful if RBP (the 64-bit base pointer register) were used to make
> stack frame chains (hint: it isn't). And these days, EBP isn't used to make
> stack pointer chains on x86 either.
>
> LLVM 3.5 contains the ability to interpret COFF files, so you could
> presumably write your own stack-walk algorithm. I don't recommend it,
> however; you might have to call out to StackWalk64 anyway to access the
> Microsoft symbol server (yes, over their network servers) to complete the
> symbol lookup correctly.
 
Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Jameson Nash
Yes, probably (I thought we already had). Someone would need to do some
comparison work first, though.
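
A rough sketch of the kind of comparison, measuring how much `@profile` slows a
workload at a few sampling delays; `workload` is a hypothetical stand-in, and
the 0.3-era positional `Profile.init(nsamples, delay)` form is assumed:

```
workload() = (s = 0.0; for i = 1:10^7; s += sqrt(i); end; s)

workload()                       # warm up so compilation isn't timed
base = @elapsed workload()       # unprofiled baseline

for delay in (0.001, 0.01, 0.1)  # 1 ms, 10 ms, 100 ms between samples
    Profile.init(10^6, delay)
    Profile.clear()
    t = @elapsed @profile workload()
    println("delay = $delay s: overhead = $(round(t / base, 2))x")
end
```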

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Tim Holy
Can somebody on a Windows system report back with the output of 
`Profile.init()`?

--Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Daniel Høegh
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.3 (2014-11-23 20:19 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-w64-mingw32

julia> Profile.init()
(100,0.001)

julia>

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Isaiah Norton

> The accuracy of windows timers is somewhat questionable.

I don't know if 5% is good enough for this purpose, but: one of our
collaborators uses (a lightly modified version of) the code at the link below
in a real-time imaging application:

http://www.codeguru.com/cpp/w-p/system/timers/article.php/c5759/Creating-a-HighPrecision-HighResolution-and-Highly-Reliable-Timer-Utilising-Minimal-CPU-Resources.htm

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-03 Thread Tim Holy
Thanks. Looks like that needs to be changed.

--Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
By default, the profiler takes one sample per millisecond. In practice, the 
timing is quite precise on Linux, seemingly within a factor of twoish on OSX, 
and nowhere close on Windows. So at least on Linux you can simply read samples 
as milliseconds.

If you want to visualize the relative contributions of each statement, I 
highly recommend ProfileView. If you use LightTable, it's already built-in via 
the profile() command. The combination of ProfileView and @profile is, in my 
(extremely biased) opinion, quite powerful compared to tools I used previously 
in other programming environments.
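
A minimal usage sketch (`myfunc` is a hypothetical workload, made up here for
illustration):

```
using ProfileView

myfunc() = for i = 1:100; inv(rand(200, 200)); end

myfunc()            # run once so JIT compilation isn't profiled
@profile myfunc()   # collect samples while the workload runs
Profile.print()     # text report of sample counts per line
ProfileView.view()  # graphical flame-graph view of the same data
```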

Finally, there's IProfile.jl, which works via a completely different mechanism 
but does report raw timings (with some pretty big caveats).

Best,
--Tim

On Monday, December 01, 2014 10:13:16 PM Christoph Ortner wrote:
> How do you get timings from the Julia profiler, or even better, %-es? I
> guess one can convert from the numbers one gets, but it is a bit painful?
>
> Christoph



Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Valentin Churavy
Hi Christoph,

If you pass a function in as an argument, the compiler gets relatively
little information about that function and cannot do much optimization on
it. That is part of the reason why map(f, xs) is currently slower than a
for-loop. There are ongoing discussions on how to solve that problem; as an
example, see https://github.com/JuliaLang/julia/issues/8450
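
A small sketch of the effect under 0.3-era behavior (the pair potential and the
array below are made up for illustration): the same loop is timed once with the
function passed in as an argument and once calling it directly.

```
phi(r) = r^(-12) - 2r^(-6)       # an illustrative Lennard-Jones-like potential

# generic version: f arrives as a plain Function argument, so the call inside
# the loop is not specialized or inlined by the 0.3-era compiler
function total_energy(f, rs)
    s = 0.0
    for r in rs
        s += f(r)
    end
    s
end

# hard-coded version: the same loop calling phi directly
function total_energy_direct(rs)
    s = 0.0
    for r in rs
        s += phi(r)
    end
    s
end

rs = 1.0 .+ rand(10^6)
total_energy(phi, rs); total_energy_direct(rs)   # warm up
@time total_energy(phi, rs)
@time total_energy_direct(rs)
```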

For profiling I used LightTable's ProfileView support and I have to agree 
with Tim that it is pretty amazing.

Valentin

On Tuesday, 2 December 2014 07:04:39 UTC+1, Christoph Ortner wrote:
> > I took a look at the Lennard-Jones example. On top of the memory
> > allocation from array slicing, there is also a lot of overhead in passing
> > lj(r) as an input function phi.
>
> this is counterintuitive to me. Why is this a problem?
>
> Christoph



Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Christoph Ortner
Dear Tim and Valentin,

Thanks for the feedback - it sounds like I need to give LightTable another
try. So far I have even had difficulties setting it up to work properly
with Julia.

But I will definitely try to play more with ProfileView in Julia.

   Christoph



Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Peter Simon
I have also experienced the inaccurate profile timings on Windows.  Is the 
reason for the bad profiler performance on Windows understood?  Are there 
plans for improvement?  

Thanks,
--Peter

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
I think it's just that Windows is bad at scheduling tasks with short-latency, 
high-precision timing, but I am not the right person to answer such questions.

--Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jiahao Chen
> Thank you for these ideas. I particularly like how to use unicode - I may
> need to learn how to do this.

The manual has some sections on Unicode input using LaTeX-like tab
completion syntax.

> Having to manually write out all these loops is a bit of a pain. Is
> @devec not doing the job?

I haven't really tried.

> You seem to say that I shouldn't use the return keyword - I like it for
> readability. But is there a Julia style guide I should start following?

All I mean is that it's not strictly required if all you do is return the
value of the last statement. We don't really have a style guide.
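
A minimal sketch of both points (the names below are made up; in the REPL, α is
entered by typing \alpha and pressing TAB):

```
# Unicode identifier entered via LaTeX-like tab completion
α = 2.5

# `return` is optional: a function returns the value of its last expression
square(x) = x^2          # one-liner form, implicit return

function scaled(x)
    y = α * x
    y^2                  # implicit return; `return y^2` is equivalent
end

println(scaled(3.0))     # 56.25
```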

Thanks,

Jiahao Chen
Staff Research Scientist
MIT Computer Science and Artificial Intelligence Laboratory


Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
Correct. Windows imposes a much higher overhead on just about every aspect
of doing profiling. Unfortunately, there isn't much we can do about this,
other than to complain to Microsoft. (It doesn't have signals, so we must
emulate them with a separate thread. The accuracy of windows timers is
somewhat questionable. And the stack walk library (for recording the
backtrace) is apparently just badly written and therefore insanely slow and
memory hungry.)

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
Although, over Thanksgiving, I pushed a number of fixes which should
improve the quality of backtraces on win32 (and make sys.dll usable there).

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
On Windows, is there any chance that one could set up a separate thread for 
profiling and use busy-wait to do the timing? (I don't even know whether one 
thread can snoop on the execution state of another thread.)

--Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
That's essentially what we do now (minus the busy-wait part). The overhead
is too high to run it any more frequently -- it already causes a
significant performance penalty on the system, even at the much lower
sample rate than on linux. However, I suspect the truncated backtraces on
win32 were exaggerating the effect somewhat -- that should not be as much
of an issue now.

Sure, windows lets you snoop on (and modify) the address space of any
process, you just need to find the right handle.
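
A hedged sketch of what "find the right handle" can look like from Julia via
`ccall` into kernel32 (Windows only; the pid and address are illustrative
placeholders, error handling is omitted, and 0.3-era type names are used):

```
const PROCESS_VM_READ           = 0x0010
const PROCESS_QUERY_INFORMATION = 0x0400

pid  = 1234                     # target process id (placeholder)
addr = 0x00400000               # address to read from (placeholder)

h = ccall((:OpenProcess, "kernel32"), stdcall, Ptr{Void},
          (Uint32, Cint, Uint32),
          PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, 0, pid)

buf   = Array(Uint8, 64)
nread = Csize_t[0]
ok = ccall((:ReadProcessMemory, "kernel32"), stdcall, Cint,
           (Ptr{Void}, Ptr{Void}, Ptr{Uint8}, Csize_t, Ptr{Csize_t}),
           h, convert(Ptr{Void}, addr), buf, length(buf), nread)

ccall((:CloseHandle, "kernel32"), stdcall, Cint, (Ptr{Void},), h)
```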

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
If the work of walking the stack is done in the thread, why does it cause any 
slowdown of the main process?

But of course the time it takes to complete the backtrace sets an upper limit 
on how frequently you can take a snapshot. In that case, though, couldn't you 
just have the thread always collecting backtraces?

--Tim

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
You can't profile a moving target. The thread must be frozen first to
ensure the stack trace doesn't change while attempting to record it.

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
On Tuesday, December 02, 2014 10:24:43 PM Jameson Nash wrote:
> You can't profile a moving target. The thread must be frozen first to
> ensure the stack trace doesn't change while attempting to record it

Got it. I assume there's no good way to make a copy and then discard if the
copy is bad?

--Tim

 
Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
you could copy the whole stack (typically only a few 100kb, max of 8MB),
then do the stack walk offline. if you could change the stack pages to
copy-on-write, it may even not be too expensive.

but this is the real problem:

```
|__/   |  x86_64-linux-gnu
julia @time for i=1:10^4 backtrace() end
elapsed time: 2.789268693 seconds (3200320016 bytes allocated, 89.29% gc
time)
```

```
|__/   |  x86_64-apple-darwin14.0.0
julia @time for i=1:10^4 backtrace() end
elapsed time: 2.586410216 seconds (640048 bytes allocated, 89.96% gc
time)
```

```
jameson@julia:~/julia-win32$ ./usr/bin/julia.exe -E  @time for i=1:10^3
backtrace() end 
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
err:dbghelp_stabs:stabs_parse Unknown stab type 0x0a
elapsed time: 22.6314386 seconds (320032016 bytes allocated, 1.51% gc time)
```

```
|__/   |  i686-w64-mingw32
julia> @time for i=1:10^4 backtrace() end
elapsed time: 69.243275608 seconds (3200320800 bytes allocated, 13.16% gc
time)
```

And yes, those gc fractions are verifiably correct. With gc_disable(), they
execute in 1/10 of the time. So that pretty much means you must take 1/100
of the samples if you want to preserve roughly the same slowdown. On
Linux, I find the slowdown to be in the range of 2-5x, and consider that to
be pretty reasonable, especially for what you're getting. If you took the
same number of samples on Windows, it would cause a 200-500x slowdown (give
or take a few percent). If you wanted to offload this work to other cores
to get the same level of accuracy and no more slowdown than Linux, you
would need a machine with 200-500 processors (give or take 2-5)!

(I think I did those conversions correctly. However, since I just did them
for the purposes of this email, sans calculator, and as I was typing, let
me know if I made more than a factor of 2 error somewhere, or just have fun
reading https://what-if.xkcd.com/84/ instead)
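
The gc effect is easy to reproduce (a sketch using the 0.3-era
gc_disable()/gc_enable() names mentioned above):

```julia
@time for i = 1:10^4; backtrace(); end   # GC enabled: most of the time is collection

gc_disable()                             # 0.3-era API, as referenced above
@time for i = 1:10^4; backtrace(); end   # roughly 10x faster, per the figures above
gc_enable()
```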

On Tue Dec 02 2014 at 6:23:07 PM Tim Holy tim.h...@gmail.com wrote:

 On Tuesday, December 02, 2014 10:24:43 PM Jameson Nash wrote:
  You can't profile a moving target. The thread must be frozen first to
  ensure the stack trace doesn't change while attempting to record it

 Got it. I assume there's no good way to make a copy and then discard if
 the
 copy is bad?

 --Tim

 
  On Tue, Dec 2, 2014 at 5:12 PM Tim Holy tim.h...@gmail.com wrote:
   If the work of walking the stack is done in the thread, why does it
 cause
   any
   slowdown of the main process?
  
   But of course the time it takes to complete the backtrace sets an upper
   limit
   on how frequently you can take a snapshot. In that case, though,
 couldn't
   you
   just have the thread always collecting backtraces?
  
   --Tim
  
   On Tuesday, December 02, 2014 09:54:17 PM Jameson Nash wrote:
That's essentially what we do now. (Minus the busy wait part). The
  
   overhead
  
is too high to run it any more frequently -- it already causes a
significant performance penalty on the system, even at the much lower
sample rate than linux. However, I suspect the truncated backtraces
 on
win32 were exaggerating the effect somewhat -- that should not be as
much
of an issue now.
   
Sure, windows lets you snoop on (and modify) the address space of any
process, you just need to find the right handle.
   
On Tue, Dec 2, 2014 at 2:18 PM Tim Holy tim.h...@gmail.com wrote:
 On Windows, is there any chance that one could set up a separate
 thread
 for
 profiling and use busy-wait to do the timing? (I don't even know
  
   whether
  
 one
 thread can snoop on the execution state of another thread.)

 --Tim

 On Tuesday, December 02, 2014 06:22:39 PM Jameson Nash wrote:
  Although, over thanksgiving, I pushed a number of fixes which
 should
  improve the quality of backtraces on win32 (and make sys.dll
 usable

 there)

  On Tue, Dec 2, 2014 at 1:20 PM Jameson Nash vtjn...@gmail.com
  
   wrote:
   Correct. Windows imposes a much higher overhead on just about
   every

 aspect

   of doing profiling. Unfortunately, there isn't much we can do
   about

 this,

   other then to complain to Microsoft. (It doesn't have signals,
 so
  
   we
  
 must

   emulate them with a separate thread. The accuracy of windows
  
   timers is
  
   somewhat questionable. And the stack walk library (for
 recording
  
   the
  
   backtrace) is apparently just badly written and therefore
 insanely
   slow
   and
   memory hungry.)
  
   On Tue, Dec 2, 2014 at 12:59 PM Tim Holy tim.h...@gmail.com
  
   wrote:
   I think it's just that Windows is bad at scheduling tasks with
   short-latency,
   high-precision timing, but I am not the right person to answer
  
   such
  
   

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Tim Holy
Wow, those are pathetically-slow backtraces. Since most of us don't have 
machines with 500 cores, I don't see anything we can do.

--Tim

On Wednesday, December 03, 2014 03:14:02 AM Jameson Nash wrote:
 you could copy the whole stack (typically only a few 100kb, max of 8MB),
 then do the stack walk offline. if you could change the stack pages to
 copy-on-write, it may even not be too expensive.
 
 but this is the real problem:
 
 ```
 
 |__/   |  x86_64-linux-gnu
 
 julia @time for i=1:10^4 backtrace() end
 elapsed time: 2.789268693 seconds (3200320016 bytes allocated, 89.29% gc
 time)
 ```
 
 ```
 
 |__/   |  x86_64-apple-darwin14.0.0
 
 julia @time for i=1:10^4 backtrace() end
 elapsed time: 2.586410216 seconds (640048 bytes allocated, 89.96% gc
 time)
 ```
 
 ```
 jameson@julia:~/julia-win32$ ./usr/bin/julia.exe -E  @time for i=1:10^3
 backtrace() end 
 fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
 fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
 err:dbghelp_stabs:stabs_parse Unknown stab type 0x0a
 elapsed time: 22.6314386 seconds (320032016 bytes allocated, 1.51% gc time)
 ```
 
 ```
 
 |__/   |  i686-w64-mingw32
 
 julia @time for i=1:10^4 backtrace() end
 elapsed time: 69.243275608 seconds (3200320800 bytes allocated, 13.16% gc
 time)
 ```
 
 And yes, those gc fractions are verifiably correct. With gc_disable(), they
 execute in 1/10 of the time. So, that pretty much means you must take 1/100
 of the samples if you want to preserve roughly the same slow down. On
 linux, I find the slowdown to be in the range of 2-5x, and consider that to
 be pretty reasonable, especially for what you're getting. If you took the
 same number of samples on windows, it would cause a 200-500x slowdown (give
 or take a few percent). If you wanted to offload this work to other cores
 to get the same level of accuracy and no more slowdown than linux, you
 would need a machine with 200-500 processors (give or take 2-5)!
 
 (I think I did those conversions correctly. However, since I just did them
 for the purposes of this email, sans calculator, and as I was typing, let
 me know if I made more than a factor of 2 error somewhere, or just have fun
 reading https://what-if.xkcd.com/84/ instead)
 
 On Tue Dec 02 2014 at 6:23:07 PM Tim Holy tim.h...@gmail.com wrote:
  On Tuesday, December 02, 2014 10:24:43 PM Jameson Nash wrote:
   You can't profile a moving target. The thread must be frozen first to
   ensure the stack trace doesn't change while attempting to record it
  
  Got it. I assume there's no good way to make a copy and then discard if
  the
  copy is bad?
  
  --Tim
  
   On Tue, Dec 2, 2014 at 5:12 PM Tim Holy tim.h...@gmail.com wrote:
If the work of walking the stack is done in the thread, why does it
  
  cause
  
any
slowdown of the main process?

But of course the time it takes to complete the backtrace sets an
upper
limit
on how frequently you can take a snapshot. In that case, though,
  
  couldn't
  
you
just have the thread always collecting backtraces?

--Tim

On Tuesday, December 02, 2014 09:54:17 PM Jameson Nash wrote:
 That's essentially what we do now. (Minus the busy wait part). The

overhead

 is too high to run it any more frequently -- it already causes a
 significant performance penalty on the system, even at the much
 lower
 sample rate than linux. However, I suspect the truncated backtraces
  
  on
  
 win32 were exaggerating the effect somewhat -- that should not be as
 much
 of an issue now.
 
 Sure, windows lets you snoop on (and modify) the address space of
 any
 process, you just need to find the right handle.
 
 On Tue, Dec 2, 2014 at 2:18 PM Tim Holy tim.h...@gmail.com wrote:
  On Windows, is there any chance that one could set up a separate
  thread
  for
  profiling and use busy-wait to do the timing? (I don't even know

whether

  one
  thread can snoop on the execution state of another thread.)
  
  --Tim
  
  On Tuesday, December 02, 2014 06:22:39 PM Jameson Nash wrote:
   Although, over thanksgiving, I pushed a number of fixes which
  
  should
  
   improve the quality of backtraces on win32 (and make sys.dll
  
  usable
  
  there)
  
   On Tue, Dec 2, 2014 at 1:20 PM Jameson Nash vtjn...@gmail.com

wrote:
Correct. Windows imposes a much higher overhead on just about
every
  
  aspect
  
of doing profiling. Unfortunately, there isn't much we can do
about
  
  this,
  
other then to complain to Microsoft. (It doesn't have signals,
  
  so
  
we

  must
  
emulate them with a separate thread. The accuracy of windows

timers is

somewhat questionable. And the stack walk library (for
  
  

Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-02 Thread Jameson Nash
(I forgot to mention that, to be fair, the Windows machine that was used
to run this test was an underpowered dual-core hyperthreaded Atom
processor, whereas the Linux and Mac machines were pretty comparable Xeon
and Sandy Bridge machines, respectively. I only gave Windows a factor of 2
advantage in the above computation in my accounting for this gap.)
On Tue Dec 02 2014 at 10:50:20 PM Tim Holy tim.h...@gmail.com wrote:

 Wow, those are pathetically-slow backtraces. Since most of us don't have
 machines with 500 cores, I don't see anything we can do.

 --Tim

 On Wednesday, December 03, 2014 03:14:02 AM Jameson Nash wrote:
  you could copy the whole stack (typically only a few 100kb, max of 8MB),
  then do the stack walk offline. if you could change the stack pages to
  copy-on-write, it may even not be too expensive.
 
  but this is the real problem:
 
  ```
 
  |__/   |  x86_64-linux-gnu
 
  julia @time for i=1:10^4 backtrace() end
  elapsed time: 2.789268693 seconds (3200320016 bytes allocated, 89.29% gc
  time)
  ```
 
  ```
 
  |__/   |  x86_64-apple-darwin14.0.0
 
  julia @time for i=1:10^4 backtrace() end
  elapsed time: 2.586410216 seconds (640048 bytes allocated, 89.96% gc
  time)
  ```
 
  ```
  jameson@julia:~/julia-win32$ ./usr/bin/julia.exe -E  @time for i=1:10^3
  backtrace() end 
  fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
  fixme:winsock:WS_EnterSingleProtocolW unknown Protocol 0x
  err:dbghelp_stabs:stabs_parse Unknown stab type 0x0a
  elapsed time: 22.6314386 seconds (320032016 bytes allocated, 1.51% gc
 time)
  ```
 
  ```
 
  |__/   |  i686-w64-mingw32
 
  julia @time for i=1:10^4 backtrace() end
  elapsed time: 69.243275608 seconds (3200320800 bytes allocated, 13.16% gc
  time)
  ```
 
  And yes, those gc fractions are verifiably correct. With gc_disable(),
 they
  execute in 1/10 of the time. So, that pretty much means you must take
 1/100
  of the samples if you want to preserve roughly the same slow down. On
  linux, I find the slowdown to be in the range of 2-5x, and consider that
 to
  be pretty reasonable, especially for what you're getting. If you took the
  same number of samples on windows, it would cause a 200-500x slowdown
 (give
  or take a few percent). If you wanted to offload this work to other cores
  to get the same level of accuracy and no more slowdown than linux, you
  would need a machine with 200-500 processors (give or take 2-5)!
 
  (I think I did those conversions correctly. However, since I just did
 them
  for the purposes of this email, sans calculator, and as I was typing, let
  me know if I made more than a factor of 2 error somewhere, or just have
 fun
  reading https://what-if.xkcd.com/84/ instead)
 
  On Tue Dec 02 2014 at 6:23:07 PM Tim Holy tim.h...@gmail.com wrote:
   On Tuesday, December 02, 2014 10:24:43 PM Jameson Nash wrote:
You can't profile a moving target. The thread must be frozen first to
ensure the stack trace doesn't change while attempting to record it
  
   Got it. I assume there's no good way to make a copy and then discard
 if
   the
   copy is bad?
  
   --Tim
  
On Tue, Dec 2, 2014 at 5:12 PM Tim Holy tim.h...@gmail.com wrote:
 If the work of walking the stack is done in the thread, why does it
  
   cause
  
 any
 slowdown of the main process?

 But of course the time it takes to complete the backtrace sets an
 upper
 limit
 on how frequently you can take a snapshot. In that case, though,
  
   couldn't
  
 you
 just have the thread always collecting backtraces?

 --Tim

 On Tuesday, December 02, 2014 09:54:17 PM Jameson Nash wrote:
  That's essentially what we do now. (Minus the busy wait part).
 The

 overhead

  is too high to run it any more frequently -- it already causes a
  significant performance penalty on the system, even at the much
  lower
  sample rate than linux. However, I suspect the truncated
 backtraces
  
   on
  
  win32 were exaggerating the effect somewhat -- that should not
 be as
  much
  of an issue now.
 
  Sure, windows lets you snoop on (and modify) the address space of
  any
  process, you just need to find the right handle.
 
  On Tue, Dec 2, 2014 at 2:18 PM Tim Holy tim.h...@gmail.com
 wrote:
   On Windows, is there any chance that one could set up a
 separate
   thread
   for
   profiling and use busy-wait to do the timing? (I don't even
 know

 whether

   one
   thread can snoop on the execution state of another thread.)
  
   --Tim
  
   On Tuesday, December 02, 2014 06:22:39 PM Jameson Nash wrote:
Although, over thanksgiving, I pushed a number of fixes which
  
   should
  
improve the quality of backtraces on win32 (and make sys.dll
  
   usable
  
   there)
  
On Tue, Dec 2, 2014 at 1:20 PM 

[julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Christoph Ortner
Hi Valentin,

This is extremely helpful - thank you for providing these tweaks. I hope 
you don't mind if I incorporate this.

(And sorry for the typos - I am normally better at testing.)

Christoph




On Sunday, 30 November 2014 20:54:40 UTC, Valentin Churavy wrote:

 I found a second error in lj_cstyle

 t is calculated wrongly:
  t = 1./s*s*s != 1/s^3

 It probably should be t = 1.0 / (s * s * s)

 t = 1.0 / (s*s*s)
 E += t*t - 2.*t
 dJ = -12.0 *(t*t - t) / s

 cstyle is on my machine still two times faster then my optimized variant 
 of jl_pretty 

 Best Valentin

 On Sunday, 30 November 2014 20:54:58 UTC+1, Valentin Churavy wrote:

 Nice work!

 Regarding the pretty Julia version of Lennard-Jones MD.

 You can shape of another second (on my machine) by not passing in the lj 
 method as a parameter, but directly calling it. 

 I tried to write an optimize version of your lj_pretty function by 
 analysing it with @profile and rewriting the slow parts. You can see my 
 results here: https://gist.github.com/vchuravy/f42f458717a7a49395a5 
 I went step for step through it and applied one optimization at a time. 
 You can also see the time computation time spend at each line as a comment. 
 Mostly I just removed temporary array allocation and then applied your math 
 optimization.

 One question though. In lj_cstyle(x) you calculate dJ = -12.*(t*t - t) * 
 s , shouldn't it be dJ = -12.*(t*t - t) / s? 

 Kind regards,
 Valentin





 On Sunday, 30 November 2014 12:51:31 UTC+1, Christoph Ortner wrote:

 Belated update to this thread:

 I have now finished a first draft of three tutorial-like numerical PDE 
 notebooks; they can be viewed at 
  http://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia
 I have two more coming up in the near future, one on spectral methods, 
 the other on an optimisation problem. For the moment, I am using them 
 primarily for my research group to learn Julia, and to show it to 
 colleagues when they are interested.  

 Q1: May I use the Julia logo on that website, as well as for any 
 tutorials / courses that I teach based on Julia?

 Q2: Eventually I think it would be good to have a Julia Examples page 
 such as
 http://www.mathworks.com/examples/

 Q3: I'd of course be interested in feedback.








[julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Valentin Churavy
You are very welcome. I hope that your collection of notebooks continues to 
grow.
On Monday, 1 December 2014 14:09:22 UTC+1, Christoph Ortner wrote:

 Hi Valentin,

 This is extremely helpful - thank you for providing these tweaks. I hope 
 you don't mind if I incorporate this.

 (And sorry for the typos - I am normally better at testing.)

 Christoph




 On Sunday, 30 November 2014 20:54:40 UTC, Valentin Churavy wrote:

 I found a second error in lj_cstyle

 t is calculated wrongly:
  t = 1./s*s*s != 1/s^3

 It probably should be t = 1.0 / (s * s * s)

 t = 1.0 / (s*s*s)
 E += t*t - 2.*t
 dJ = -12.0 *(t*t - t) / s

 cstyle is on my machine still two times faster then my optimized variant 
 of jl_pretty 

 Best Valentin

 On Sunday, 30 November 2014 20:54:58 UTC+1, Valentin Churavy wrote:

 Nice work!

 Regarding the pretty Julia version of Lennard-Jones MD.

 You can shape of another second (on my machine) by not passing in the lj 
 method as a parameter, but directly calling it. 

 I tried to write an optimize version of your lj_pretty function by 
 analysing it with @profile and rewriting the slow parts. You can see my 
 results here: https://gist.github.com/vchuravy/f42f458717a7a49395a5 
 I went step for step through it and applied one optimization at a time. 
 You can also see the time computation time spend at each line as a comment. 
 Mostly I just removed temporary array allocation and then applied your math 
 optimization.

 One question though. In lj_cstyle(x) you calculate dJ = -12.*(t*t - t) 
 * s , shouldn't it be dJ = -12.*(t*t - t) / s? 

 Kind regards,
 Valentin





 On Sunday, 30 November 2014 12:51:31 UTC+1, Christoph Ortner wrote:

 Belated update to this thread:

 I have now finished a first draft of three tutorial-like numerical PDE 
 notebooks; they can be viewed at 
  http://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia
 I have two more coming up in the near future, one on spectral methods, 
 the other on an optimisation problem. For the moment, I am using them 
 primarily for my research group to learn Julia, and to show it to 
 colleagues when they are interested.  

 Q1: May I use the Julia logo on that website, as well as for any 
 tutorials / courses that I teach based on Julia?

 Q2: Eventually I think it would be good to have a Julia Examples page 
 such as
 http://www.mathworks.com/examples/

 Q3: I'd of course be interested in feedback.








[julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Christoph Ortner
Hi Steven,
Thanks for the feedback - yes it was intentional, but I am definitely 
planning to extend the notebook with performance enhancements.
I was not yet aware of this distinction between Int and Integer. Thanks for 
pointing it out.
Christoph


On Monday, 1 December 2014 15:49:10 UTC, Steven G. Johnson wrote:

 I notice that in your finite-element notebook you use:

 T = zeros(Integer, 3, 2*N^2)


 You really want to use Int in cases like this (Int = Int64 on 64-bit 
 machines and Int32 on 32-bit machines).  Using Integer means that you have 
 an array of pointers to generic integer containers, whose type must be 
 checked at runtime (e.g. T[i] could be an Int32, Int64, BigInt, etc.).

 (You also have very Matlab-like code that allocates zillions of little 
 temporary arrays in your inner loops.  I assume this is intentional, but it 
 is definitely suboptimal performance, possibly by an order of magnitude.)
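
A small illustration of the Int vs Integer point above (the array shape here
is hypothetical):

```julia
a = zeros(Int, 3, 4)        # concrete element type: values stored inline
b = zeros(Integer, 3, 4)    # abstract element type: every element is boxed
eltype(a), eltype(b)        # (Int64, Integer) on a 64-bit machine
```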



Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Jiahao Chen
I took a look at the Lennard-Jones example. On top of the memory allocation
from array slicing, there is also a lot of overhead in passing lj(r) as an
input function phi. There is also some overhead from computing size(x) in
the range for the inner loop.

Consider this version of the code, which is C style in that the loops are
written out explicitly, but otherwise is not that far from the original
lj_pretty:

# definition of the Lennard-Jones potential
function LJ(r)
    r⁻² = sumabs2(r)
    r⁻⁶ = 1.0/(r⁻²*r⁻²*r⁻²)
    J = r⁻⁶*r⁻⁶ - 2.0r⁻⁶
    ∇ᵣJ = -12. * (r⁻⁶*r⁻⁶ - r⁻⁶) / r⁻²
    J, ∇ᵣJ #--
end

function lj_cstyle2(x)
    d, N = size(x)
    E = 0.0
    ∇E = zeros(x) #note that zeros supports this form too
    r = zeros(d)
    @inbounds for n = 1:(N-1), k = (n+1):N
        for i = 1:d
            r[i] = x[i,k]-x[i,n]
        end
        J, ∇ᵣJ = LJ(r) #--
        E += J
        for i = 1:d
            ∇E[i, k] += ∇ᵣJ * r[i]
            ∇E[i, n] -= ∇ᵣJ * r[i]
        end
    end
    E, ∇E #return keyword not strictly required
end
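
Hypothetical usage, with d = 2 and N = 100 random particle positions:

```julia
x = rand(2, 100)            # each column is a particle position
E, ∇E = lj_cstyle2(x)
```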

For me, lj_cstyle2 runs in 83 ms, compared to 66 ms for lj_cstyle.

There is some overhead from constructing and unpacking the tuples in the
two lines marked with  #--, but I think it is unavoidable if you want to
have a single external function that computes both the potential and its
gradient.

Note that this version of LJ(r) computes two numbers, not a number and a
vector. For any central potential the gradient along r, ∇ᵣJ,  is sufficient
to characterize the entire force ∇E,  so there is not much loss of
generality in not returning the force in the original coordinates. In this
case, the mathematical structure of the problem is not entirely at odds
with its implementation in code.
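
Written out with s = |r|² = sumabs2(r), the two quantities returned by LJ(r)
are (reconstructed from the code above):

```latex
J(s) = s^{-6} - 2\,s^{-3}, \qquad
\nabla_{\!r} J = \frac{1}{|r|}\,\frac{dJ}{d|r|} = 2\,\frac{dJ}{ds}
               = -\frac{12\,(s^{-6} - s^{-3})}{s},
```

so the per-pair force contribution applied in the inner loop is ∇ᵣJ * r[i].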

Thanks,

Jiahao Chen
Staff Research Scientist
MIT Computer Science and Artificial Intelligence Laboratory

On Mon, Dec 1, 2014 at 12:14 PM, Christoph Ortner 
christophortn...@gmail.com wrote:

 Hi Steven,
 Thanks for the feedback - yes it was intentional, but I am definitely
 planning to extend the notebook with performance enhancements.
 I was not yet aware of this distinction between Int and Integer. Thanks
 for pointing it out.
 Christoph


 On Monday, 1 December 2014 15:49:10 UTC, Steven G. Johnson wrote:

 I notice that in your finite-element notebook you use:

 T = zeros(Integer, 3, 2*N^2)


 You really want to use Int in cases like this (Int = Int64 on 64-bit
 machines and Int32 on 32-bit machines).  Using Integer means that you have
 an array of pointers to generic integer containers, whose type must be
 checked at runtime (e.g. T[i] could be an Int32, Int64, BigInt, etc.).

 (You also have very Matlab-like code that allocates zillions of little
 temporary arrays in your inner loops.  I assume this is intentional, but it
 is definitely suboptimal performance, possibly by an order of magnitude.)




Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Jiahao Chen
* r⁻² should be r², of course.

Note that the following very naive version of LJ(r)

function LJ(r)
    r² = sumabs2(r)
    J = 1.0/(r²)^6 - 2.0/(r²)^3
    ∇ᵣJ = (-12.0/(r²)^6 + 12.0/(r²)^3) / r²
    return J, ∇ᵣJ
end

results in code that runs in 0.42s, which is not terrible (and such
performance would be expected for more complicated Lennard-Jones potentials
which are not 12-6).

Thanks,

Jiahao Chen
Staff Research Scientist
MIT Computer Science and Artificial Intelligence Laboratory

On Mon, Dec 1, 2014 at 9:40 PM, Jiahao Chen jia...@mit.edu wrote:

 I took a look at the Lennard-Jones example. On top of the memory
 allocation from array slicing, there is also a lot of overhead in passing
 lj(r) as an input function phi. There is also some overhead from computing
 size(x) in the range for the inner loop.

 Consider this version of the code, which is C style in that the loops
 are explicitly unrolled, but otherwise is not that far from the original
 lj_pretty:

 # definition of the Lennard-Jones potential
 function LJ(r)
 r⁻² = sumabs2(r)
 r⁻⁶ = 1.0/(r⁻²*r⁻²*r⁻²)
 J = r⁻⁶*r⁻⁶ - 2.0r⁻⁶
 ∇ᵣJ = -12. * (r⁻⁶*r⁻⁶ - r⁻⁶) / r⁻²
 J, ∇ᵣJ #--
 end

 function lj_cstyle2(x)
 d, N = size(x)
 E = 0.0
 ∇E = zeros(x) #note that zeros supports this form too
 r = zeros(d)
 @inbounds for n = 1:(N-1), k = (n+1):N
 for i = 1:d
 r[i] = x[i,k]-x[i,n]
 end
 J, ∇ᵣJ = LJ(r) #--
 E += J
 for i = 1:d
 ∇E[i, k] += ∇ᵣJ * r[i]
 ∇E[i, n] -= ∇ᵣJ * r[i]
 end
 end
 E, ∇E #return keyword not strictly required
 end

 For me, lj_cstyle2 runs in 83 ms, compared to 66 ms for lj_cstyle.

 There is some overhead from constructing and unpacking the tuples in the
 two lines marked with  #--, but I think it is unavoidable if you want to
 have a single external function that computes both the potential and its
 gradient.

 Note that this version of LJ(r) computes two numbers, not a number and a
 vector. For any central potential the gradient along r, ∇ᵣJ,  is sufficient
 to characterize the entire force ∇E,  so there is not much loss of
 generality in not returning the force in the original coordinates. In this
 case, the mathematical structure of the problem is not entirely at odds
 with its implementation in code.

 Thanks,

 Jiahao Chen
 Staff Research Scientist
 MIT Computer Science and Artificial Intelligence Laboratory

 On Mon, Dec 1, 2014 at 12:14 PM, Christoph Ortner 
 christophortn...@gmail.com wrote:

 Hi Steven,
 Thanks for the feedback - yes it was intentional, but I am definitely
 planning to extend the notebook with performance enhancements.
 I was not yet aware of this distinction between Int and Integer. Thanks
 for pointing it out.
 Christoph


 On Monday, 1 December 2014 15:49:10 UTC, Steven G. Johnson wrote:

 I notice that in your finite-element notebook you use:

 T = zeros(Integer, 3, 2*N^2)


 You really want to use Int in cases like this (Int = Int64 on 64-bit
 machines and Int32 on 32-bit machines).  Using Integer means that you have
 an array of pointers to generic integer containers, whose type must be
 checked at runtime (e.g. T[i] could be an Int32, Int64, BigInt, etc.).

 (You also have very Matlab-like code that allocates zillions of little
 temporary arrays in your inner loops.  I assume this is intentional, but it
 is definitely suboptimal performance, possibly by an order of magnitude.)





Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Christoph Ortner

Thank you for these ideas. I particularly like the use of unicode - I may 
need to learn how to do this.

Having to manually write out all these loops is a bit of a pain. Is @devec 
not doing the job?

You seem to say that I shouldn't use the return keyword - I like it for 
readability. But is there a Julia style guide I should start following?

Christoph




Re: [julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Christoph Ortner



I took a look at the Lennard-Jones example. On top of the memory allocation 
 from array slicing, there is also a lot of overhead in passing lj(r) as an 
 input function phi. 


this is counterintuitive to me. Why is this a problem?

 Christoph 


[julia-users] Re: Simple Finite Difference Methods

2014-12-01 Thread Christoph Ortner

How do you get timings from the Julia profiler, or even better, %-es? I 
guess one can convert from the numbers one gets, but it is a bit painful?

Christoph



[julia-users] Re: Simple Finite Difference Methods

2014-11-30 Thread Christoph Ortner
Belated update to this thread:

I have now finished a first draft of three tutorial-like numerical PDE 
notebooks; they can be viewed at 
 http://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia
I have two more coming up in the near future, one on spectral methods, the 
other on an optimisation problem. For the moment, I am using them primarily 
for my research group to learn Julia, and to show it to colleagues when 
they are interested.  

Q1: May I use the Julia logo on that website, as well as for any tutorials 
/ courses that I teach based on Julia?

Q2: Eventually I think it would be good to have a Julia Examples page 
such as
http://www.mathworks.com/examples/

Q3: I'd of course be interested in feedback.








[julia-users] Re: Simple Finite Difference Methods

2014-11-30 Thread Valentin Churavy
Nice work!

Regarding the pretty Julia version of Lennard-Jones MD.

You can shave off another second (on my machine) by not passing in the lj 
method as a parameter, but directly calling it. 

I tried to write an optimized version of your lj_pretty function by 
analysing it with @profile and rewriting the slow parts. You can see my 
results here: https://gist.github.com/vchuravy/f42f458717a7a49395a5 
I went step by step through it and applied one optimization at a time. You 
can also see the computation time spent at each line as a comment. 
Mostly I just removed temporary array allocation and then applied your math 
optimization.
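
In code, the change of calling lj directly instead of passing it in looks 
roughly like this (a sketch with hypothetical names; in 0.3-era Julia an 
argument is only known to be a ::Function, so the call through phi stays a 
dynamic dispatch and cannot be inlined):

```julia
lj(s) = 1.0/(s*s*s)^2 - 2.0/(s*s*s)   # hypothetical stand-in potential

function energy_param(phi, xs)        # potential passed in as a parameter
    E = 0.0
    for s in xs
        E += phi(s)                   # dynamic dispatch on phi at every call
    end
    E
end

function energy_direct(xs)            # potential called directly
    E = 0.0
    for s in xs
        E += lj(s)                    # lj is known here, so the call can be fast
    end
    E
end

xs = 1.0 .+ rand(10^6)
@time energy_param(lj, xs)
@time energy_direct(xs)
```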

One question though. In lj_cstyle(x) you calculate dJ = -12.*(t*t - t) * s 
, shouldn't it be dJ = -12.*(t*t - t) / s? 

Kind regards,
Valentin





On Sunday, 30 November 2014 12:51:31 UTC+1, Christoph Ortner wrote:

 Belated update to this thread:

 I have now finished a first draft of three tutorial-like numerical PDE 
 notebooks; they can be viewed at 
  http://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia
 I have two more coming up in the near future, one on spectral methods, the 
 other on an optimisation problem. For the moment, I am using them primarily 
 for my research group to learn Julia, and to show it to colleagues when 
 they are interested.  

 Q1: May I use the Julia logo on that website, as well as for any tutorials 
 / courses that I teach based on Julia?

 Q2: Eventually I think it would be good to have a Julia Examples page 
 such as
 http://www.mathworks.com/examples/

 Q3: I'd of course be interested in feedback.








[julia-users] Re: Simple Finite Difference Methods

2014-11-30 Thread Valentin Churavy
I found a second error in lj_cstyle

t is calculated wrongly:
 t = 1./s*s*s != 1/s^3

It probably should be t = 1.0 / (s * s * s)

t = 1.0 / (s*s*s)
E += t*t - 2.*t
dJ = -12.0 *(t*t - t) / s
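
The precedence point is easy to check at the REPL (the value of s here is 
hypothetical):

```julia
s = 2.0
1.0 / s * s * s     # parses as ((1.0 / s) * s) * s == 2.0, i.e. s, not 1/s^3
1.0 / (s * s * s)   # == 0.125 == 1/s^3
```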

cstyle is on my machine still two times faster than my optimized variant of 
lj_pretty.

Best Valentin

On Sunday, 30 November 2014 20:54:58 UTC+1, Valentin Churavy wrote:

 Nice work!

 Regarding the pretty Julia version of Lennard-Jones MD.

 You can shape of another second (on my machine) by not passing in the lj 
 method as a parameter, but directly calling it. 

 I tried to write an optimize version of your lj_pretty function by 
 analysing it with @profile and rewriting the slow parts. You can see my 
 results here: https://gist.github.com/vchuravy/f42f458717a7a49395a5 
 I went step for step through it and applied one optimization at a time. 
 You can also see the time computation time spend at each line as a comment. 
 Mostly I just removed temporary array allocation and then applied your math 
 optimization.

 One question though. In lj_cstyle(x) you calculate dJ = -12.*(t*t - t) * 
 s , shouldn't it be dJ = -12.*(t*t - t) / s? 

 Kind regards,
 Valentin





 On Sunday, 30 November 2014 12:51:31 UTC+1, Christoph Ortner wrote:

 Belated update to this thread:

 I have now finished a first draft of three tutorial-like numerical PDE 
 notebooks; they can be viewed at 
  http://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia
 I have two more coming up in the near future, one on spectral methods, 
 the other on an optimisation problem. For the moment, I am using them 
 primarily for my research group to learn Julia, and to show it to 
 colleagues when they are interested.  

 Q1: May I use the Julia logo on that website, as well as for any 
 tutorials / courses that I teach based on Julia?

 Q2: Eventually I think it would be good to have a Julia Examples page 
 such as
 http://www.mathworks.com/examples/

 Q3: I'd of course be interested in feedback.








[julia-users] Re: Simple Finite Difference Methods

2014-11-21 Thread Steven G. Johnson
I prefer to construct multidimensional Laplacian matrices from the 1d ones 
via Kronecker products, and the 1d ones from -D'*D where D is a 1d 
first-derivative matrix, to make the structure (symmetric-definite!) and 
origin of the matrices clearer.   I've posted a notebook showing some 
examples from my 18.303 class at MIT:

http://nbviewer.ipython.org/url/math.mit.edu/~stevenj/18.303/lecture-10.ipynb

(It would be nicer to construct the sdiff1 function so that the matrix is 
sparse from the beginning, rather than converting from a dense matrix.  But 
I was lazy and dense matrices on 1d grids are cheap.)
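
The construction described above can be sketched in a few lines (a minimal 
sketch, not the notebook's code; N is a hypothetical grid size and 
speye/spzeros are the 0.3/0.4-era sparse constructors):

```julia
N = 10
h = 1.0 / (N + 1)

# 1d first-difference matrix D, (N+1) x N, built sparse from the start,
# with homogeneous Dirichlet values implied outside the grid
D = spzeros(N + 1, N)
for i = 1:N
    D[i, i]     =  1.0 / h
    D[i + 1, i] = -1.0 / h
end

A1 = -D' * D                          # 1d Laplacian: symmetric and negative definite
Id = speye(N)
A2 = kron(A1, Id) + kron(Id, A1)      # 2d Laplacian via Kronecker products
```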


[julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Christoph Ortner
Turns out this was *very* quick to translate from the Matlab codes. If 
there is any interest I can post my IJulia notebook somewhere.

On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote:

 I am trying to port 
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian
 to Julia for a quick Julia intro for some grad students at my department. 
 Has anybody ported `numgrid` and `delsq` ?

 Thanks,
Christoph



Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Christoph Ortner
I've put the current version at
https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb

Would there be any interest in porting a larger set of Matlab examples to 
Julia? I wouldn't mind porting a subset of the Mathematics section on
http://uk.mathworks.com/examples/matlab

This might be a small way in which I could contribute here, as so far I've 
mostly been enjoying the benefits of everyone else's hard work :).

Christoph





On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote:

 Yes, translation from Matlab to Julia is generally pretty easy. Would be 
 cool to take a look at at the code you ended up with.

 On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner christop...@gmail.com 
 javascript: wrote:

 Turns out this was *very* quick to translate from the Matlab codes. If 
 there is any interest I can post my IJulia notebook somewhere.


 On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote:

 I am trying to port 
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian
 to Julia for a quick Julia intro for some grad students at my 
 department. Has anybody ported `numgrid` and `delsq` ?

 Thanks,
Christoph




Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Stefan Karpinski
I would be quite wary of material that may be copyrighted by MathWorks. You may not 
be legally allowed to port it or even download it.


 On Nov 20, 2014, at 2:23 PM, Christoph Ortner christophortn...@gmail.com 
 wrote:
 
 I've put the current version at
 https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb
 
 Would there be any interest in porting a larger set of Matlab examples to 
 Julia? I wouldn't mind porting a subset of the Mathematics section on
 http://uk.mathworks.com/examples/matlab
 
 This might be a small way how I could contribute here, as so far I've mostly 
 been enjoying the benefits of everyone else's hard work :).
 
 Christoph
 
 
 
 
 
 On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote:
 Yes, translation from Matlab to Julia is generally pretty easy. Would be 
 cool to take a look at at the code you ended up with.
 
 On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner christop...@gmail.com 
 wrote:
 Turns out this was *very* quick to translate from the Matlab codes. If 
 there is any interest I can post my IJulia notebook somewhere.
 
 
 On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote:
 I am trying to port 
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian
 to Julia for a quick Julia intro for some grad students at my department. 
 Has anybody ported `numgrid` and `delsq` ?
 
 Thanks,
Christoph
 


Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Mauro
The usual copyright issues apply here though: numgrid and delsq are
copyrighted Matlab functions which you are not allowed to copy, and even
less allowed to post copied code.  Best re-implement them without looking
at the source code!

Same goes for all of http://uk.mathworks.com/examples/matlab

On Thu, 2014-11-20 at 20:23, Christoph Ortner christophortn...@gmail.com 
wrote:
 I've put the current version at
 https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb

 Would there be any interest in porting a larger set of Matlab examples to 
 Julia? I wouldn't mind porting a subset of the Mathematics section on
 http://uk.mathworks.com/examples/matlab

 This might be a small way how I could contribute here, as so far I've 
 mostly been enjoying the benefits of everyone else's hard work :).

 Christoph





 On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote:

 Yes, translation from Matlab to Julia is generally pretty easy. Would be 
 cool to take a look at at the code you ended up with.

 On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner christop...@gmail.com 
 javascript: wrote:

 Turns out this was *very* quick to translate from the Matlab codes. If 
 there is any interest I can post my IJulia notebook somewhere.


 On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote:

 I am trying to port 
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian
 to Julia for a quick Julia intro for some grad students at my 
 department. Has anybody ported `numgrid` and `delsq` ?

 Thanks,
Christoph






Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Christoph Ortner
too bad. Anyhow, if I continue to develop a similar set of elementary 
tutorials, I will make them available.
   Christoph


On Thursday, 20 November 2014 19:29:34 UTC, Mauro wrote:

 The usual copyright issues apply here though: numgird and delsq are 
 copyrighted Matlab functions which you are not allowed to copy and even 
 less allowed post copied code.  Best re-implement them without looking 
 at the source code! 

 Same goes for all of http://uk.mathworks.com/examples/matlab 

 On Thu, 2014-11-20 at 20:23, Christoph Ortner christop...@gmail.com 
 javascript: wrote: 
  I've put the current version at 
  https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb 
  
  Would there be any interest in porting a larger set of Matlab examples 
 to 
  Julia? I wouldn't mind porting a subset of the Mathematics section on 
  http://uk.mathworks.com/examples/matlab 
  
  This might be a small way how I could contribute here, as so far I've 
  mostly been enjoying the benefits of everyone else's hard work :). 
  
  Christoph 
  
  
  
  
  
  On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote: 
  
  Yes, translation from Matlab to Julia is generally pretty easy. Would 
 be 
  cool to take a look at at the code you ended up with. 
  
  On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner 
 christop...@gmail.com 
  javascript: wrote: 
  
  Turns out this was *very* quick to translate from the Matlab codes. If 
  there is any interest I can post my IJulia notebook somewhere. 
  
  
  On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote: 
  
  I am trying to port 
  
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian 
  to Julia for a quick Julia intro for some grad students at my 
  department. Has anybody ported `numgrid` and `delsq` ? 
  
  Thanks, 
 Christoph 
  
  
  



Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Christoph Ortner
I've also removed the notebook.

On Thursday, 20 November 2014 20:15:51 UTC, Christoph Ortner wrote:

 too bad. Anyhow, if I continue to develop a similar set of elementary 
 tutorials, I will make them available.
Christoph


 On Thursday, 20 November 2014 19:29:34 UTC, Mauro wrote:

 The usual copyright issues apply here though: numgird and delsq are 
 copyrighted Matlab functions which you are not allowed to copy and even 
 less allowed post copied code.  Best re-implement them without looking 
 at the source code! 

 Same goes for all of http://uk.mathworks.com/examples/matlab 

 On Thu, 2014-11-20 at 20:23, Christoph Ortner christop...@gmail.com 
 wrote: 
  I've put the current version at 
  
 https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb 
  
  Would there be any interest in porting a larger set of Matlab examples 
 to 
  Julia? I wouldn't mind porting a subset of the Mathematics section on 
  http://uk.mathworks.com/examples/matlab 
  
  This might be a small way how I could contribute here, as so far I've 
  mostly been enjoying the benefits of everyone else's hard work :). 
  
  Christoph 
  
  
  
  
  
  On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote: 
  
  Yes, translation from Matlab to Julia is generally pretty easy. Would 
 be 
  cool to take a look at at the code you ended up with. 
  
  On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner 
 christop...@gmail.com 
  javascript: wrote: 
  
  Turns out this was *very* quick to translate from the Matlab codes. 
 If 
  there is any interest I can post my IJulia notebook somewhere. 
  
  
  On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote: 
  
  I am trying to port 
  
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian 
  to Julia for a quick Julia intro for some grad students at my 
  department. Has anybody ported `numgrid` and `delsq` ? 
  
  Thanks, 
 Christoph 
  
  
  



Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Mauro
The whole copyright stuff is a pain!  IANAL, but as far as I understand
it:
- it's ok to copy examples which have some open source licence, probably
  needing to keep that licence.
- it's ok to re-do someone else's examples as long as you don't look at
  source code.  However, that is probably a bit tricky as examples are
  usually mixed with code.
- if unsure, ask the original author.  This should work fine for
  academics and the like but probably less well for Mathworks.

This could be a PDE example to look into:
http://wiki.scipy.org/PerformancePython

Tutorials/examples would definitely be useful!

On Thu, 2014-11-20 at 21:17, Christoph Ortner christophortn...@gmail.com 
wrote:
 I've also removed the notebook.

 On Thursday, 20 November 2014 20:15:51 UTC, Christoph Ortner wrote:

 too bad. Anyhow, if I continue to develop a similar set of elementary 
 tutorials, I will make them available.
Christoph


 On Thursday, 20 November 2014 19:29:34 UTC, Mauro wrote:

 The usual copyright issues apply here though: numgird and delsq are 
 copyrighted Matlab functions which you are not allowed to copy and even 
 less allowed post copied code.  Best re-implement them without looking 
 at the source code! 

 Same goes for all of http://uk.mathworks.com/examples/matlab 

 On Thu, 2014-11-20 at 20:23, Christoph Ortner christop...@gmail.com 
 wrote: 
  I've put the current version at 
  
 https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb 
  
  Would there be any interest in porting a larger set of Matlab examples 
 to 
  Julia? I wouldn't mind porting a subset of the Mathematics section on 
  http://uk.mathworks.com/examples/matlab 
  
  This might be a small way how I could contribute here, as so far I've 
  mostly been enjoying the benefits of everyone else's hard work :). 
  
  Christoph 
  
  
  
  
  
  On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote: 
  
  Yes, translation from Matlab to Julia is generally pretty easy. Would 
 be 
  cool to take a look at at the code you ended up with. 
  
  On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner 
 christop...@gmail.com 
  javascript: wrote: 
  
  Turns out this was *very* quick to translate from the Matlab codes. 
 If 
  there is any interest I can post my IJulia notebook somewhere. 
  
  
  On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner wrote: 
  
  I am trying to port 
  
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian 
  to Julia for a quick Julia intro for some grad students at my 
  department. Has anybody ported `numgrid` and `delsq` ? 
  
  Thanks, 
 Christoph 
  
  
  





Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Christoph Ortner
Good idea to start from the Python tutorials - I will look into that.
Christoph


On Thursday, 20 November 2014 20:45:01 UTC, Mauro wrote:

 The whole copyright stuff is a pain!  IANAL, but as far as I understand 
 it: 
 - it's ok to copy examples which have some open source licence, probably 
   needing to keep that licence. 
 - it's ok to re-do someone elses examples as long as you don't look at 
   source code.  However, that is probably a bit tricky as examples are 
   usually mixed with code. 
 - if unsure, ask the original author.  This should work fine for 
   academics and the like but probably less well for Mathworks. 

 This could be a PDE example to look into: 
 http://wiki.scipy.org/PerformancePython 

 Tutorials/examples would definitely be useful! 

 On Thu, 2014-11-20 at 21:17, Christoph Ortner christop...@gmail.com 
 javascript: wrote: 
  I've also removed the notebook. 
  
  On Thursday, 20 November 2014 20:15:51 UTC, Christoph Ortner wrote: 
  
  too bad. Anyhow, if I continue to develop a similar set of elementary 
  tutorials, I will make them available. 
 Christoph 
  
  
  On Thursday, 20 November 2014 19:29:34 UTC, Mauro wrote: 
  
  The usual copyright issues apply here though: numgird and delsq are 
  copyrighted Matlab functions which you are not allowed to copy and 
 even 
  less allowed post copied code.  Best re-implement them without looking 
  at the source code! 
  
  Same goes for all of http://uk.mathworks.com/examples/matlab 
  
  On Thu, 2014-11-20 at 20:23, Christoph Ortner christop...@gmail.com 
  wrote: 
   I've put the current version at 
   
  https://dl.dropboxusercontent.com/u/9561945/JuliaNBs/Laplacian.ipynb 
   
   Would there be any interest in porting a larger set of Matlab 
 examples 
  to 
   Julia? I wouldn't mind porting a subset of the Mathematics section 
 on 
   http://uk.mathworks.com/examples/matlab 
   
   This might be a small way how I could contribute here, as so far 
 I've 
   mostly been enjoying the benefits of everyone else's hard work :). 
   
   Christoph 
   
   
   
   
   
   On Thursday, 20 November 2014 17:33:25 UTC, Stefan Karpinski wrote: 
   
   Yes, translation from Matlab to Julia is generally pretty easy. 
 Would 
  be 
   cool to take a look at at the code you ended up with. 
   
   On Thu, Nov 20, 2014 at 11:54 AM, Christoph Ortner  
  christop...@gmail.com 
   javascript: wrote: 
   
   Turns out this was *very* quick to translate from the Matlab 
 codes. 
  If 
   there is any interest I can post my IJulia notebook somewhere. 
   
   
   On Thursday, 20 November 2014 15:08:30 UTC, Christoph Ortner 
 wrote: 
   
   I am trying to port 
   
  
 http://uk.mathworks.com/examples/matlab/1091-finite-difference-laplacian 
   to Julia for a quick Julia intro for some grad students at my 
   department. Has anybody ported `numgrid` and `delsq` ? 
   
   Thanks, 
  Christoph 
   
   
   
  
  



Re: [julia-users] Re: Simple Finite Difference Methods

2014-11-20 Thread Pontus Stenetorp
On 21 November 2014 05:35, Mauro mauro...@runbox.com wrote:

 - it's ok to re-do someone elses examples as long as you don't look at
   source code.  However, that is probably a bit tricky as examples are
   usually mixed with code.

Having been involved with WiFi drivers back in the day, it is insane what
kind of hacks you can pull off to dodge copyright.  The Wikipedia
article on Clean room design [1] is a pretty good read.

Pontus

[1]: http://en.wikipedia.org/wiki/Clean_room_design