Re: [Haskell-cafe] Mathematics and Statistics libraries

2012-03-25 Thread Tom Doris
Hi Heinrich,

If we compare the GHCi experience with R or IPython, leaving aside any
GUIs, the help system they have at the repl level is just a lot more
intuitive and easy to use, and you get access to the full manual
entries. For example, compare what you see if you type :info sort into
GHCi versus ?sort in R. R gives you a view of the full docs for the
function, whereas in GHCi you just get the type signature.

I usually :def a command that calls out to :!hoogle --info %, which
gives you what you'd expect :info to show. So, as is usually the case,
there's a solution in Haskell that matches the features in other
systems, but it's not the default and you have to invest effort
getting it set up right. This is fine for Haskell devs who do some
stats work, but it represents an offputtingly steep learning curve for
quants who are willing to learn a little Haskell but expect
(reasonably) some basic stuff like inline help to Just Work.
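
For anyone who wants to try this, a minimal sketch of such a definition follows. It assumes the hoogle executable is on your PATH with a generated database; the command name :doc is just an illustrative choice. The line goes in ~/.ghci.

```haskell
-- ~/.ghci: define :doc so that ":doc sort" runs "hoogle --info" on the name.
-- The argument is wrapped in double quotes so operators survive the shell.
:def doc \name -> return $ ":!hoogle --info \"" ++ name ++ "\""
```

After restarting GHCi, typing :doc sort should print Hoogle's documentation entry for the best match rather than just the type signature.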

Tom

On 25 March 2012 08:26, Heinrich Apfelmus <apfel...@quantentunnel.de> wrote:
> Tom Doris wrote:
>>
>> If you're interested in UI work, ideally we'd have something similar
>> to RStudio as an environment, a simple set of windows encapsulating an
>> editor, a repl, a plotting panel and help/history, this sounds
>> superficial but it really has an impact when you're exploring a data
>> set and trying stuff out.
>
> Concerning UI, the following project suggestion aims to give GHCi a web GUI
>
>   http://hackage.haskell.org/trac/summer-of-code/ticket/1609
>
> But one of your criteria is that a good UI should come with a help system,
> too, right?
>
> Best regards,
> Heinrich Apfelmus
>
> --
> http://apfelmus.nfshost.com



 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe



Re: [Haskell-cafe] Mathematics and Statistics libraries

2012-03-24 Thread Tom Doris
If the goal is to help Haskell be a more acceptable choice for general
statistical analysis tasks, then hmatrix, statistics, and the various
gsl wrappers already provide the majority of the functionality needed.
I think the bigger problems are that there is no guidance on which
libraries are industrial strength, there's no glue layer making it
easier to use the APIs you'd want to, and GHCi isn't always ideal as a
repl for this workflow.

If you're interested in UI work, ideally we'd have something similar
to RStudio as an environment, a simple set of windows encapsulating an
editor, a repl, a plotting panel and help/history, this sounds
superficial but it really has an impact when you're exploring a data
set and trying stuff out. However, it would be a bigger contribution
to get us to the point where we are able to just import
Quant.Prelude to bring into scope all the standard functionality
assumed in an environment like R or Matlab. In my experience most of
this can come from re-exporting existing libraries while occasionally
wrapping functions to simplify the interfaces and make them more
consistent (e.g., a quant doesn't particularly need to know why
Statistics.Sample.KernelDensity.kde uses unboxed vectors when the rest
of that lib uses Generic, and they certainly won't want to spend their
time remembering that they need to convert to call that function).

As an exercise, in GHCi, try loading a few arbitrary csv files of
tables including floating point columns, do a linear regression of one
such column on another, and then display a scatterplot with the
regression line, maybe throw in a check for the normality of the
residuals. Assume you'll need to be able to handle large data sets so
you need to use bytestring, attoparsec etc; beware that there's a
known bug that will cause a segfault/bus error if you use some
hmatrix/gsl functions from GHCi on x86_64, which is kind of a blocker
in itself. Maybe I missed something obvious but it took me a looong
time to figure out which containers, persistence + parsing, stats and
plotting packages I should choose.
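
To give a sense of the numerical core of that exercise without committing to any of the packages named above, here is a minimal sketch using only base. The names (splitCommas, parseColumns, linearFit, residuals) are made up for illustration. The CSV handling is deliberately naive: it assumes no header row and no quoted fields, and uses String rather than bytestring/attoparsec. Plotting and the normality check are left out.

```haskell
import Data.List (transpose)

-- | Split one line on commas. Naive: quoted fields are not supported.
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitCommas rest

-- | Parse a headerless, purely numeric CSV into a list of columns.
-- 'read' will throw on any field that is not a valid Double.
parseColumns :: String -> [[Double]]
parseColumns = transpose . map (map read . splitCommas) . lines

-- | Ordinary least squares fit of y on x, returned as (intercept, slope).
linearFit :: [Double] -> [Double] -> (Double, Double)
linearFit xs ys = (intercept, slope)
  where
    n         = fromIntegral (length xs)
    meanX     = sum xs / n
    meanY     = sum ys / n
    sxy       = sum (zipWith (\x y -> (x - meanX) * (y - meanY)) xs ys)
    sxx       = sum (map (\x -> (x - meanX) ^ (2 :: Int)) xs)
    slope     = sxy / sxx
    intercept = meanY - slope * meanX

-- | Residuals y - (intercept + slope * x) for a fitted line.
residuals :: (Double, Double) -> [Double] -> [Double] -> [Double]
residuals (a, b) = zipWith (\x y -> y - (a + b * x))
```

In GHCi, something like `let [xs, ys] = parseColumns "1,3\n2,5\n3,7\n"` followed by `linearFit xs ys` gives the fit. A real version would swap the lists for unboxed vectors and the parser for attoparsec.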

I really disagree that we need a data frame type structure; they're an
abomination in R: they try to accommodate both event records and time
series, and do neither well. Haskell records are fine for
inhomogeneous event series, and for homogeneous time series parallel
Vectors or Matrices are better, as they can be passed to BLAS and
LAPACK with consequent performance and clarity advantages.
Column-oriented storage rocks, and Haskell is already a good fit.
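
As a rough illustration of that split, here is a small sketch. The types Trade and Series and the helper names are hypothetical. UArray from the array package (bundled with GHC) stands in for the unboxed Vectors mentioned above, purely so the example needs no extra packages.

```haskell
import Data.Array.Unboxed (UArray, elems, listArray)

-- | An inhomogeneous event: fields of different types, one record per event.
data Trade = Trade
  { tradeTime   :: Double
  , tradeSymbol :: String
  , tradePrice  :: Double
  } deriving (Show)

-- | A homogeneous time series stored as parallel unboxed columns.
data Series = Series
  { seriesTimes  :: UArray Int Double
  , seriesValues :: UArray Int Double
  }

-- | Pack a list into a zero-indexed unboxed column.
column :: [Double] -> UArray Int Double
column xs = listArray (0, length xs - 1) xs

-- | Build a series from matching lists of timestamps and values.
fromLists :: [Double] -> [Double] -> Series
fromLists ts vs = Series (column ts) (column vs)

-- | A column-wise operation: the arithmetic mean of one column.
columnMean :: UArray Int Double -> Double
columnMean col = sum xs / fromIntegral (length xs)
  where xs = elems col
```

Each column is a flat block of Doubles, which is the layout numerical libraries want, while the record type keeps heterogeneous events readable.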

Having used C++, Matlab and R (the latter for quite a while), I now use
Haskell for all of my statistical analysis work. Despite the many
shortcomings, it's definitely worth it for the code clarity and type
checking, to say nothing of the pre-optimization performance and
robustness.

Best of luck, happy to share some preliminary code with you directly
if you're interested!
Tom



On 21 March 2012 17:24, Ben Jones <ben.jamin.pw...@gmail.com> wrote:
> I am a student currently interested in participating in Google Summer of
> Code. I have a strong interest in Haskell, and a semester's worth of coding
> experience in the language. I am a mathematics and cs double major with only
> a semester left and I am looking for information regarding what the
> community is lacking as far as mathematics and statistics libraries are
> concerned. If there is enough interest I would like to put together a
> project with this. I understand that such libraries are probably low
> priority, but if anyone has anything I would love to hear it.
>
> Thanks for reading,
>       -Benjamin



Re: [Haskell-cafe] empty fields are dropped in bytestring csv

2012-02-18 Thread Tom Doris
Hacky patch to fix this, for future reference, against bytestring-csv-0.1.2.
The cost center annotations were used to anecdotally verify that the change
doesn't significantly impact performance. (Interestingly, the Alex lexer in
bytestring-csv appears to allocate 1.5GB while lexing a 1.6MB csv file!?)

Text/CSV/ByteString.hs

65c65
< fields   = [ unquote s | Item s <- line ]
---
> fields   = [ unquote s | Item s <- pline line]
76a77,86
>
>
> pline fs@(Item x : []) = fs
> pline (Item x : Comma : []) = {-# SCC "plinea" #-} Item x : Comma : Item S.empty : []
> pline (Item x : Comma : rs) = {-# SCC "plineb" #-} Item x : Comma : pline rs
> pline (Comma : []) = {-# SCC "plinec" #-} Comma : Item S.empty : Comma : Item S.empty : []
> pline (Comma : rs) = {-# SCC "plined" #-} Item S.empty : Comma : pline rs
> pline (Newline : rs ) = []
> pline [] = []
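
To see the idea in isolation, here is a simplified, standalone restatement of the padding logic. It is not the package's code: the Token type below is a hypothetical stand-in using String instead of the lexer's ByteString tokens, the names padEmpty and fieldsOf are made up, and the SCC annotations are dropped.

```haskell
-- | Hypothetical stand-in for the lexer's token type.
data Token = Item String | Comma | Newline
  deriving (Eq, Show)

-- | Insert an empty Item wherever a comma has no field next to it.
padEmpty :: [Token] -> [Token]
padEmpty [Item x]              = [Item x]
padEmpty [Item x, Comma]       = [Item x, Comma, Item ""]   -- trailing comma
padEmpty (Item x : Comma : rs) = Item x : Comma : padEmpty rs
padEmpty [Comma]               = [Item "", Comma, Item ""]  -- lone comma
padEmpty (Comma : rs)          = Item "" : Comma : padEmpty rs
padEmpty _                     = []  -- Newline, end of input, or unexpected tokens

-- | Extract field contents from one line of tokens, keeping empty fields.
fieldsOf :: [Token] -> [String]
fieldsOf line = [ s | Item s <- padEmpty line ]
```

With this, the tokens for the row 1,,9 (Item "1", Comma, Comma, Item "9") yield three fields with an empty string in the middle, instead of two.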



On 17 February 2012 23:16, Tom Doris <tomdo...@gmail.com> wrote:

> the bytestring-csv package appears to have a bug whereby empty fields are
> dropped completely from the row, which is different to Text.CSV, which
> will return an empty field in the parse result. I'd argue this is a bug in
> bytestring-csv, anyone know whether this has been raised before, or know of
> a workaround?
>
> Prelude Data.Maybe Data.List Text.CSV.ByteString Data.ByteString.Char8>
> parseCSV $ pack "a,b,c\n1,2,3\n1,,9\n"
> Just [["a","b","c"],["1","2","3"],["1","9"]]
>
> -- the last row has two fields ^
>
> Prelude Text.CSV> parseCSV "/tmp/err" "a,b,c\n1,2,3\n1,,9\n"
> Right [["a","b","c"],["1","2","3"],["1","","9"],[""]]





[Haskell-cafe] empty fields are dropped in bytestring csv

2012-02-17 Thread Tom Doris
The bytestring-csv package appears to have a bug whereby empty fields are
dropped completely from the row, which is different to Text.CSV, which
will return an empty field in the parse result. I'd argue this is a bug in
bytestring-csv. Does anyone know whether this has been raised before, or know
of a workaround?

Prelude Data.Maybe Data.List Text.CSV.ByteString Data.ByteString.Char8>
parseCSV $ pack "a,b,c\n1,2,3\n1,,9\n"
Just [["a","b","c"],["1","2","3"],["1","9"]]

-- the last row has two fields ^

Prelude Text.CSV> parseCSV "/tmp/err" "a,b,c\n1,2,3\n1,,9\n"
Right [["a","b","c"],["1","2","3"],["1","","9"],[""]]


[Haskell-cafe] hmatrix under ghci on x86_64

2012-02-13 Thread Tom Doris
I'm using ghci + hmatrix and a few other packages as a Haskell-based
replacement for Matlab, and everything works well so far in terms of available
functionality. However, I have encountered an issue when running in ghci on
x86_64 systems: calls into functions that in turn call gsl functions
result in a bus error, e.g.:

Prelude> :m +Numeric.Container
Prelude Numeric.Container> randomVector 10 Gaussian 10
fromList Bus error (core dumped)

I attached gdb and found the bus error was happening in gsl_rng_alloc(), but
some investigation indicates that the problem is probably due to this bug in
ghci: http://hackage.haskell.org/trac/ghc/ticket/2912 which has been marked
as a duplicate of http://hackage.haskell.org/trac/ghc/ticket/781 and a
recent update to 781 indicates that it won't be addressed until at least
v7.6.1 (781 also references http://hackage.haskell.org/trac/ghc/ticket/3658 and
it seems that this is a pretty large piece of work: moving to a fully
dynamically linked ghci, which has been around for a while and pushed back a
few times).

Does anyone know of a workaround that would allow ghci to use wrapped gsl
functionality on x86_64 systems in the meantime? Most Linux boxes used by
quants are x86_64 now, so this issue will impact many people who would like
to use Haskell instead of Matlab.

Thanks


[Haskell-cafe] library-profiling default

2011-08-04 Thread Tom Doris
Hi
Is there a good reason that the default for library-profiling in
.cabal/config is set to False? It seems a lot of people hit the problem of
trying to profile for the first time, finding it doesn't work because
profiling libraries haven't been installed, then they have to walk the
dependencies reinstalling everything.

Is there a major cost or problem with just defaulting this to True?

Apologies if this is answered elsewhere, I saw various discussions on why it
is difficult to automatically build required libs with profiling on demand,
but nothing that discussed changing the default so that they are always
built.
Tom


[Haskell-cafe] Data.Judy

2010-10-12 Thread Tom Doris
Hi,
Are there any plans to extend the current Data.Judy package to include
bindings to JudySL and JudyHS? There's a standalone binding to JudySL by
Andrew Choi that is usable, but it would of course be better to have the
functionality in the Data.Judy package proper.
Thanks
Tom


[Haskell-cafe] chart broken under 6.12 according to criterion

2010-07-01 Thread Tom Doris
According to the criterion.cabal file shipped with the latest (0.5.0.1)
version of criterion, the Chart package is broken under GHC 6.12:

flag Chart


[Haskell-cafe] Re: chart broken under 6.12 according to criterion

2010-07-01 Thread Tom Doris
 According to the criterion.cabal file shipped with the latest (0.5.0.1)
version of criterion, the Chart package is broken under GHC 6.12:

flag Chart
   description: enable use of the Chart package
   -- Broken under GHC 6.12 so far

Does anyone know the status of this problem? It's been a little frustrating
getting Criterion up and running: it didn't work at all under 6.10 due to a
compiler bug ("The impossible happened" error on uvector install), and now it
works under 6.12 but without the nice charts that are so useful. I'd appreciate
any insight or workarounds for this, thanks.

(Apologies, previous email sent prematurely!)
Tom


On 1 July 2010 10:16, Tom Doris <tomdo...@gmail.com> wrote:

> According to the criterion.cabal file shipped with the latest (0.5.0.1)
> version of criterion, the Chart package is broken under GHC 6.12:
>
> flag Chart


