Re: [Haskell-cafe] Big Arrays
Thanks for the response. That sounds sequence comparison seems very impressive On Wed, Oct 6, 2010 at 2:23 PM, Ketil Malde ke...@malde.org wrote: Hemanth Kapila saihema...@gmail.com writes: Let us say, we are using a bit-array of size 2^43 (that is, a byte array of size 2^40) to store a bloom filter. And let us further assume that we are interested in a false-positive probability of 0.01 Since we are just making up numbers, let us instead say we are using a bit array of size 2^32 - still too large for Int indexing (which was the issue here) but only 500MB in size. That means, I will be able to use this array to represent a set of cardinality 9.18e11 ~ 10^12 ...bringing it down to less than 10^9, easily reached for building an indexing of k-words (tuples) in e.g. the human genome (3GB). But: I'm about to start analyzing Illumina sequencing data, where we have sequences from two individuals of the same species. I'm interesting in the differences between these species, so I might want to index both data sets and screen the sequences of each against the other to identify bits that don't appear in the other. Since I'll have easily tens, maybe a hundred gigabytes of sequence from each, it's not impossible to get into the territory you describe. (In practice, things will be limited by available RAM, sadly still some orders of magnitude less than 2^40 - although apparently SGI can deliver it. I guess I just need a bigger budget.) I was curious to know what sort of programs would be dealing with sets of 10^12 elements. Most likely a program using mutation, and probably not copying, multi-generation GC. Other data sets that have considerable size are acoustics data (I understand a modern sonar deliver about a Gbit/sec continously), and seismic data. Also, for more moderate bodies of data and more complex analysis, you might want to use less frugal data structure, like suffix arrays. -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Lazy evaluation from Why Functional programming matters
Hi, Let us try to rewrite the code in a more java-esque syntax: It translates to something like the below generic method. Correct? static T T function(IBoundsCheckT within, DeltaT eps, IteratorT iterator, T initValue){ T currVal = initVal; while(iterator.hasNext()){ T nextVal = iterator.next(); if(within.verify(delta, eps, currVal, nextVal)) return currVal; currVal = nextVal } } I have not tested it but I think this is a fair translation of the code. (For instance, by using an appropriate implementation of IBoundsCheck, I will be able to implement the 'relativeSqrt' functionality of the example). But this IS still a lazy evaluation. By passing an iterator instead of a list as the third argument of the static method, I achieved 'laziness'. In the example, the laziness is in the way we are iterating over the sequence of values [a0,f(a0), f(f(a0)),...] and so on and not on when the runtime evaluates appropriate values. Just that having to write, (repeat (next N) a0) is (take 1000 (repeat 1)) times more intuitive and convenient than having to implement the Iterator for T or implementing a true-while loop. /Hemanth K On Tue, Oct 5, 2010 at 4:50 PM, C K Kashyap ckkash...@gmail.com wrote: Hi All, I was going through the paper's lazy evaluation section where the square root example is given. It occurred to me that one could implement it in a modular way with just higher order functions (without the need for lazy evaluation that is). function f (within, eps, next, a0){ while(true){ a1=next(a0); if(within(a0,a1,eps)return a0; a0=a1; } } Is this not the case? -- Regards, Kashyap ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Lazy evaluation from Why Functional programming matters
I see ... I think I understand now. hmmm ... I am little disappointed though - does that mean that all the laziness cool stuffs can actually be done using iterators(generators)? As in, but for the inconvenient syntax, you can do it all in - say java? Yes. It would slightly easier in, say, C# or C++. I think 'D' achieves its implementation of the 'lazy' keyword using a similar approach. But I did not understand why you are disappointed ? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Big Arrays
Thanks Ketil. On Wed, Oct 6, 2010 at 1:36 AM, Ketil Malde ke...@malde.org wrote: Just out of curiosity, may I know a use case of such huge arrays? Bloom filters? Am probably dense - I still did not get an idea of where would one use such a big array. Let us say, we are using a bit-array of size 2^43 (that is, a byte array of size 2^40) to store a bloom filter. And let us further assume that we are interested in a false-positive probability of 0.01 That means, I will be able to use this array to represent a set of cardinality 9.18e11 ~ 10^12 I was curious to know what sort of programs would be dealing with sets of 10^12 elements. Am mainly curious about how one would decide that bloom filters are the best algorithm when dealing with this amount of data.What factors do we consider when deciding the algorithm or the data structure? The impact on GC would be taken into account, for example (am guessing there would be at least one copy from a younger generation to a permanent generation)? /Hemanth K ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re[2]: [Haskell-cafe] Big Arrays
Just out of curiosity, may I know a use case of such huge arrays? At such sizes, I thought, the array would not have the expected array properties (constant access time) due to thrashing. thanks, Hemanth ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Embedded scripting Language for haskell app
Hi, Can some one please give me a suggestion on the best choice for an embedded scripting Language for a haskell application? I mean, something like guile/lua for c/c++ and groovy/jruby for java. For quite some time, I've been using a lisp-like interpreter that I implemented myself. But this is not going too well - going by this road, I suspect I will end up with a mule. I am looking for a pony (a declarative programming language). I am okay with a donkey too. baskell[1] seems interesting. And there's hslua[2]. Can one use hint[3] like this ? Thanks Hemanth K [1] baskell - http://hackage.haskell.org/package/baskell [2] hslua - http://hackage.haskell.org/package/hslua [3] hint - http://hackage.haskell.org/package/hint ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Embedded scripting Language for haskell app
thanks to all for the responses. I tried hint and hslua and haskell and both are very nice. My sincere thanks and kudos to the developers and maintainers of the packages (and to Bulat for the tutorial on hslua). FWIW, a couple of observations: 1. haskell indeed makes a great scripting language (as expected) 2. hint is fast* , for the examples that I ran on. (I had this 'superstition' that haskell , when interpreted, is kinda slow). I was about to toss a coin to decide which one to pickup. Perhaps I should worry about the size. Thanks Hemanth ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] unboxed arrays restricted to simple types (Int, Float, ..)
On Wed, Nov 11, 2009 at 5:28 PM, Tillmann Vogt tillmann.v...@rwth-aachen.de wrote: Hi, I tried to use unboxed arrays for generating an antialiased texture. To make it easier to understand, here is the stripped down code that produces an error: import Control.Monad.ST import Data.Array.ST import Data.Array.Unboxed import Data.Word type BitMask = UArray Int Word16 -- for determining the grey value of a pixel type Pixels = (Int, Int, T) data T = N | B BitMask -- this does not work -- type T = Int -- this works if int the next line N is replaced by ..lets say 0 f = newArray (0,10) N :: (ST s (STUArray s Int T)) http://hackage.haskell.org/packages/archive/array/0.2.0.0/doc/html/Data-Array-MArray.html#t%3AMArray shows that mutable/unboxed arrays only allow simple types: i.e. MArray (STUArray s) Int32 (ST s) Isn't this ugly? Imagine this would be the case in C: struct stupidArrayElement{ int a; int b; // not allowed! } stupidArrayElement s[10]; Wouldn't it be nice to have something like: MArray (STUArray s) e (ST s) with e being a non-recursive data type (like data T = N | B Bitmask). My understanding of Haskell isn't deep enough to know if I have overlooked something or if the problem is solvable without a language extension. With a language extension I guess that it is not hard to find out if an abstract data type is non-recursive. Then this type should be serializable automatically. What do you think? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe Actually, there's a cool package called storable record. Could it be of some use to you? (Perhaps you *might* be able to use it if the BitMasks are of uniform length). Am not 100% sure though. Isn't this ugly? I am not sure if it is really *ugly*...and if am allowed to nit-pick, the analogy with C is not appropriate either. Arrays are just different. (At least thats how I console myself, when am looking for a high performance strict array). Also, on an approximately related issue, I was suggested to look into data parallel arrays. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Storable Vector as a Storable record?
Hi, Is it possible to somehow make a StorableVector of a StorableVector via store-record or something? If yes, could some one please provide me with some hint? I need a very fast and efficient array of a large number of $ arrays of *Int *s. And the storableVector seems to be extremely nice. Am trying to understand the way a tuple is made a storable record in the synthesizer package on hackage but I thought I will ask it here too. Thanks in advance Hemanth K ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Suggestions for simulating Object ID
Hello all, Can some one please suggest something on simulating some sort of object-Ids in Haskell? I mean, something like the following relation will hold: a1 == a2 = (objectID a1) == (objectID a2) Currently I have something like import Data.Hash import qualified Data.ByteString.Char8 as B newtype UniqueID = UniqueID (Hash, B.ByteString) I have a type class called Identifiable, the instances of which are supposed to give a unqiue bytestring representation of the instance which can be arbitrarily long but should satisfy the condition mentioned earlier. class Identifiable a where instanceID :: a - B.ByteString and finally, the required funtion objectID is defined as : objectID :: (Hashable a, Identifiable a) = a - UniqueID objectID a1 = UniqueID (hash a1, instanceID a1) while comparing object Ids, I compare the hash values first (which can be faster for arbitrary values). Only if they are equal then I compare the instanceIDs and count on laziness for avoiding the construction of the bytestrings where they are not needed. Can you please suggest a better approach? I also toyed with the idea of storing all the data 'objects' in an array and use the index as the identifier but the trouble is I am not aware of all the number of instances before hand under all the circumstances. Many thanks Hemanth K ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Leksah works!
Hi, Thanks for the update. I just tried it and it is quite cool.Atlhough I am too addicted to emacs and can't imagine leaving it (at least before hIDE), I still think leksah is quite useful. Thanks again On Tue, Jun 30, 2009 at 10:45 AM, Eugene Kirpichov ekirpic...@gmail.comwrote: Hi. Quite a while ago I launched Leksah and couldn't get anything done at all; so I thought it is probably never be completed and abandoned attempts to find an IDE for Haskell. However, 3 days ago I launched the new version and it works fantastic! It has an IntelliSense popup with type annotations, a module browser, build-on-the-fly and other things, even though I used it only for 15 minutes (then the ICFP contest began, where I wrote in Python and Java). Main point: It seems a vastly more convenient IDE for Haskell than vim (don't know about emacs-mode). So, I'd like to encourage haskellers to install it and give it a try :) -- Eugene Kirpichov Web IR developer, market.yandex.ru ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Suggestions for simulating Object ID
Thank you. It looks perfect for now. Just that I can create it only inside an IO (newUnique is of type IO Unique) Perhaps it involves some magic with random numbers or system time. Can't we come up with something like this staying within the limits of purity? Thanks Hemanth On 6/30/09, Felipe Lessa felipe.le...@gmail.com wrote: On Tue, Jun 30, 2009 at 03:25:26PM +0530, Hemanth Kapila wrote: Can some one please suggest something on simulating some sort of object-Ids in Haskell? Data.Unique[1]? [1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Unique.html -- Felipe. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Ensuring Type Class instances follow the 'rules'
Hi all, Recently, I participated in a coding competition. As part of it, I had to write a program wherein I had to make my data-type an instance of Ord. An error in my implementation of compare resulted in me losing quite a bit of valuable time. As I wrote it, I had tested it out on the ghci and it seemed to work fine but I noticed that there was trouble when the List.sort started giving weird output. I then noticed that for certain instances (say t1,t2), both compare t1 t2 AND compare t2 t1 returned LT. Once I spotted it, it was easy enough to fix but I did lose an hour or so thinking something went wrong before I fed the list to the sort function. I was the only haskeller in the competition and I believe am among the first to finish a correct application (by a large margin, and) and I did manage to raise a bit of haskell-awareness, but if I had managed to spot the error earlier, the results would have been dazzling. Does any one here have any advice on dealing with such maladroitness on the part of the programmer, especially while creating instances to type-classes? Thanks and regards, Hemanth ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Ensuring Type Class instances follow the 'rules'
Thanks Eugene and wren. Serves me right, actually. The one chapter I skipped in RWH was the 11th one called Testing and Quality Assurance thinking it is too boring. On Fri, May 29, 2009 at 12:44 PM, wren ng thornton w...@freegeek.orgwrote: Eugene Kirpichov wrote: Use QuickCheck. Also use SmallCheck or lazy SmallCheck. All of these are automatic tools that allow you to specify laws[1] and automatically generate values for testing the law. QuickCheck generates values randomly, which can be useful but is very often insufficient. Both versions of SmallCheck do exhaustive search of all values down to a bounded depth (lazy SmallCheck can often search deeper and faster). If you want to have decent assurances of correctness then use (lazy) SmallCheck. If you're dealing with a really branchy type which makes it prohibitive to search to any reasonable depth, then use QuickCheck in addition. Occasionally QuickCheck can be good for ferreting out really obscure corner cases, but SmallCheck is much better about guaranteeing you get all the basic cases. [1] e.g. prop_compare a b = compare a b == negateOrdering (compare b a) where negateOrdering LT = GT negateOrdering EQ = EQ negateOrdering GT = LT -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe