Re: [racket-users] Re: appending files

2016-01-27 Thread George Neuner

On 1/27/2016 10:50 AM, Brandon Thomas wrote:

On Wed, 2016-01-27 at 04:01 -0500, George Neuner wrote:
> On Tue, 26 Jan 2016 23:00:01 -0500, Brandon Thomas
>  wrote:
>
> > Is there anything stopping you from restructuring
> > the data on disk and using the hash directly from there
>
> Scotty's hash table is much larger than he thinks it is and very
> likely is being paged to disk already.  Deliberately implementing it
> as a disk file is unlikely to improve anything.

That's a good point to keep in mind. But there are advantages,
including faster startup time, using less ram+swap, easier to keep the
file updated and it makes it easier to make a resize solution. There
are probably more, but basically it's the reasons why all large
(key,value) storage solutions I've herd of use an explicit file instead
of swap.


I miscalculated the scale of Scotty's hash structure - it's not as bad 
as I thought initially.  But even so, it is of a scale where it is 
unwieldy and bound to have virtual memory problems unless the machine is 
dedicated.


Hashing is latency sensitive - it was designed to be a memory resident 
technique.  Obviously it _can_ be done using file based buckets ... the 
effect is of querying an ISAM database for ever access.  The problem is 
that the latency increases by orders of magnitude: even from resident 
blocks, every access involves the file system API and the kernel.  You 
did mention (user space) caching, but that greatly complicates the solution.


Making the hash external I think is not a win - it definitely will 
handle much bigger files, but it will handle every file more slowly.  I 
think it is better to leverage the file system rather than fight it.


YMMV,
George

--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: appending files

2016-01-27 Thread George Neuner

On 1/27/2016 9:49 AM, scott coston wrote:

yes, me neither (11pm for me). i just rechecked my spreadsheet and i typed in 
128 where i should have typed 256. but it doesn't really matter as the my test 
file is small compared to where the file will be going.


Mine was worse - it's downright embarrassing. :-[When I was running 
the numbers somehow I put in  6e10  as the the scale - 60 billion 
instead of the 6 million you specified - and I didn't realize it until 
after I had sent the reply.


Your analysis of the hash size is quite reasonable (modulo allocator 
overhead).   I do think, though, that you can save considerable space 
with a byte array vs pointer array + bignums.




anyway, since you seem to have some ideas i have more info for you. take a look 
at this:
https://www.aaai.org/Papers/AAAI/1996/AAAI96-178.pdf


I'll take a look at it.

I haven't had to figure out an external merge filter for decades. It can 
be tricky when the number of files needs to be variable.  And also as I 
mentioned, the fact that you want it sorted on the non-filtered field is 
a complication.





From: racket-users@googlegroups.com  on behalf of 
George Neuner 
Sent: Wednesday, January 27, 2016 4:28 AM
To: racket-users@googlegroups.com
Subject: Re: [racket-users] Re: appending files

Sorry.   I shouldn't do math at 4am. Ignore the numbers.   However, it
is still correct that the byte array will use less space than an array
of bignums.
George



--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: appending files

2016-01-27 Thread Brandon Thomas
On Wed, 2016-01-27 at 17:49 -0500, George Neuner wrote:
> On 1/27/2016 10:50 AM, Brandon Thomas wrote:
> > On Wed, 2016-01-27 at 04:01 -0500, George Neuner wrote:
> > > On Tue, 26 Jan 2016 23:00:01 -0500, Brandon Thomas
> > >  wrote:
> > > 
> > > > Is there anything stopping you from restructuring
> > > > the data on disk and using the hash directly from there
> > > 
> > > Scotty's hash table is much larger than he thinks it is and very
> > > likely is being paged to disk already.  Deliberately implementing
> > > it
> > > as a disk file is unlikely to improve anything.
> > 
> > That's a good point to keep in mind. But there are advantages,
> > including faster startup time, using less ram+swap, easier to keep
> > the
> > file updated and it makes it easier to make a resize solution.
> > There
> > are probably more, but basically it's the reasons why all large
> > (key,value) storage solutions I've herd of use an explicit file
> > instead
> > of swap.
> 
> I miscalculated the scale of Scotty's hash structure - it's not as
> bad 
> as I thought initially.  But even so, it is of a scale where it is 
> unwieldy and bound to have virtual memory problems unless the machine
> is 
> dedicated.
> 
> Hashing is latency sensitive - it was designed to be a memory
> resident 
> technique.  Obviously it _can_ be done using file based buckets ...
> the 
> effect is of querying an ISAM database for ever access.  The problem
> is 
> that the latency increases by orders of magnitude: even from
> resident 
> blocks, every access involves the file system API and the
> kernel.  You 
> did mention (user space) caching, but that greatly complicates the
> solution.
> Making the hash external I think is not a win - it definitely will 
> handle much bigger files, but it will handle every file more
> slowly.  I 
> think it is better to leverage the file system rather than fight it.
> 
> YMMV,
> George

Yup, the canonical solutions to storing large amounts of data scalably
are rather complicated and are work to implement (I wouldn't say
unmanagable though). And they get much harder if you wanted, say, ACID
properties. This is why it's been factored into libraries, like Redis
for example. If it can fit into ram efficiently enough, that's ok. If
he's planning to scale though, maybe it's worth looking at using
Racket's ffi with a library like Redis or something. It might be the
least work and the cleanest solution.

Regards,
Brandon Thomas

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: appending files

2016-01-27 Thread George Neuner
On Wed, 27 Jan 2016 11:17:04 -0800 (PST), Scotty C
 wrote:

>On Wednesday, January 27, 2016 at 2:57:42 AM UTC-6, gneuner2 wrote:
>
>> What is this other field on which the file is sorted?
>this field is the cost in operators to arrive at the key value

Is it important to retain that sorting?  Or is it just informational?

>> In the worst case of every key being a bignum
>no, every key is contained within a bignum which can contain many many keys.

Then you're not using the hash in a conventional manner ... else the
filter entries would be unique ... and we really have no clue what
you're actually doing.  So any suggestions we give you are shots in
the dark.


>> Since you are only comparing the hash entries for equality, you could
>> save a lot of space [at the expense of complexity] by defining a
>> {bucket x chain_size x 16} byte array and storing the 16-byte keys
>> directly.
>i must be able to grow the chains. i can't make it fixed size like that.

You said previously:

>>> ... i use separate chaining with a vector of
>>> bignums. i am willing to let my chains run up to 5 keys per chain
>>> so i need a vector of 6 million pointers.

That reads as needing a maximum of  6M x 5  entries.  

Exstensible structures don't have to be a list - a small array works
very well when the number of entries is bounded [which you said it
was].

Again this is a case of we really have no clue what you are doing.


>> > have another rather large bignum in memory that i use to reduce
>> >but not eliminate record duplication of about .5 gb. 
>> ???
>
>ha, ok. this is what this bignum is for. cycle elimination. a sequence
>of operators (2 bit per) when strung together is a number like 4126740
>which represents the operator sequence (0 0 3 0 0 3 1 1 2 2 1). i
>change that bit in the bignum from 0 to 1. during data generation i 
>look up my about to be applied operator sequence in the bignum. if i 
>see a one, i skip data generation. i'm not really happy with the volume
>of memory this takes but it is an insanely fast lookup and keeps a ton of
>data off the hard drive.


>> In the meantime, I would suggest you look up "merge sort" and it's
>logarithmic? not happening

By splitting the too-large data file and trying to process it in
sections, you're already O(M*N) where M is the number of sections.
Logarithmic might improve things.

Rejecting an approach due solely to its BigO is naive.  The order
tells how it scales to the limits ... not how it behaves in any
particular situation.  In most cases there are constants and other
linear multipliers that dominate the running time.

It matter not just how many times you touch the data, but also the
cost of each touch.

E.g., people read that Quicksort is O(N*LogN)  and ignore that it is
O(N*N) under certain circumstances.  Mergesort has a higher constant
multiplier, but it is  O(N*LogN)  in _all_ cases ... including those
that throw Quicksort into fits.  Plus Quicksort can't be used
effectively when the data set exceeds RAM.  Mergesort and some others
can be used regardless of the data set size.

Also remember that it's the pattern of data access that's important -
not what is actually being done with it.

Even when data fits in memory, random access algorithms often are not
the best.  Prediction, caching and paging concerns often make the most
effective algorithm require complex rearrangement and staging of data
that is unnatural wrt the overall approach to the problem.

Disk file systems are optimized for sequential access.  When data is
file based, sequential access is to be preferred unless there's no
other choice.  

George

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: Playing sound

2016-01-27 Thread Lehi Toskin
On Wednesday, January 27, 2016 at 11:41:06 AM UTC-8, Jordan Johnson wrote:
> Hi all,
> 
> 
> I'm looking at audio again because I have a student doing a game project for 
> which he wants background music to play — and I'm feeling some trepidation 
> based on most of the audio-tagged packages on pkgs.racket-lang.org showing 
> some test failures or other build problems.
> 
> 
> His minimal goal is to loop a sound file from the start of play until the 
> end, and have it work on Windows and Mac. We tried play-sound from 
> racket/gui/base, and it almost fits: we just need to be able to kill the 
> sound at the end, and it seems that killing the sound-playing thread is not 
> enough. Is there a way to stop the audio?
> 
> 
> His ideal goal is to be able to play different background music depending on 
> the room the player is in. AFAICT this is beyond play-sound's ability, at 
> least if he wants to do smooth transitions. Is there another library which 
> would facilitate cross-fading from one audio file to another?
> 
> 
> More generally, what are currently our best tools for incorporating music and 
> sound effects into Racket games?
> 
> Best,
> Jordan

This may be larger than the scope of the project, but there is an audio library 
called RSound, which uses PortAudio (via a wrapper). It's more robust and I'm 
certain there are things you could do with it that you couldn't with play-sound.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] lazy vs strict, raco vs racket

2016-01-27 Thread Linh Chi Nguyen
Hello, so i have this racket code to run simulation, and I can run it on sorta 
a cluster (the lab utilises the computer classroom at night and turns it into a 
cluster - as they call it).

The thing is when i run on the cluster, it doesn't accept any kind of GUI, so i 
have to switch from (require plot) into (require plot/no-gui), even if i don't 
use any "plot" in the process. I mean i just print out some data and the "plot" 
function (though appear in the code script) isn't used at all. This error 
doesn't happen when i run on my laptop.

Does that mean that when i run the code on laptop, it's a lazy evaluation, and 
on the cluster, it's a strict evaluation?

Second point, what is raco vs racket? For example, when this line "racket -t 
file.rkt" is used, it runs the main module inside the file. And when this line 
"raco test -s test-name file.rkt" it runs a test module inside the file.

However, the former line will print out data in the current directory, the 
latter print out data in the code script directory (if these two directories 
are different). So why is that? 

>From a non-programmer perspective,
chi

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: appending files

2016-01-27 Thread Brandon Thomas
On Wed, 2016-01-27 at 04:01 -0500, George Neuner wrote:
> On Tue, 26 Jan 2016 23:00:01 -0500, Brandon Thomas
>  wrote:
> 
> > Is there anything stopping you from restructuring
> > the data on disk and using the hash directly from there
> 
> Scotty's hash table is much larger than he thinks it is and very
> likely is being paged to disk already.  Deliberately implementing it
> as a disk file is unlikely to improve anything.
> 
> George
> 

That's a good point to keep in mind. But there are advantages,
including faster startup time, using less ram+swap, easier to keep the
file updated and it makes it easier to make a resize solution. There
are probably more, but basically it's the reasons why all large
(key,value) storage solutions I've herd of use an explicit file instead
of swap.

Regards,
Bradon Thomas

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: appending files

2016-01-27 Thread Brandon Thomas
On Tue, 2016-01-26 at 22:48 -0800, Scotty C wrote:
> ok brandon, that's a thought. build the hash on the hard drive at the
> time of data creation. you mention collision resolution. so let me
> build my hash on the hard drive using my 6 million buckets but
> increase the size of each bucket from 5 slots to 20. right?
> i can't exactly recreate my vector/bignum hash on the hard drive
> because i can't dynamically resize the buckets like i can the
> bignums. this gives me a 4 gb file whereas my original was 1 gb. i
> have enough space for that so that's not a problem.

Sure, a bigger file works. Your bignums are probably implemented using
linked lists, if I'm not mistaken. That's why they are O(1) resizable.
You could do the same thing on hdd. The problem is that you might need
to seek and read through the file more with linked lists. Since your
probably paging, your implementation also has this same problem. You
can get rid of this problem in both by using vectos though, as you
suggest. And you can have a proccess in the background that resizes the
file (by creating a new one) when your other one gets too full, so you
don't block operations.

> so as my buckets fill up they head towards the average of 5 data
> items per bucket. so on average here's what happens with each hd hash
> record. i go to my hd hash and read 3.5 (think about it) items and
> 90% of the time i don't find my data so i do a write. in my process i
> do an initial write, then a read, a write, a read, a write. compare:
> 3.5 vs 2 reads; 1 vs 3 writes. the reads are more costly and if i
> exceed 20 items in a bucket the hd hash breaks. what do you think? is
> it worth it?
> 

I think that implementation has room for improvement. Let's say you
look for a record, it does not exist, and you need to write it. So, you
seek to the location, read the bucket into ram (1 read). If the entry
isn't in the bucket, you seek to the location in the bucket and write
your entry (1 write). Of course to do this, you'd think you'd need to
make a function for your hash that searches for an entry, and uses a
fallback to write the record if it fails. That's a reasonable function
though. But you can also introduce the idea of using a cache, and have
a background proccess update the file (that way your hahs oeprations
aren't waiting on the hdd). Though, your operating systems cache file
cache might even be taking care of this for you, so you might want to
do some benchmarking.

There are probably several other optimizations you can make, and you
can probably find them by going through the source/docs of other
(key,value) storage solutions. Or if you wanted to cut down on work,
you can just use one of the other (key,value) storage solutions and use
it via libffi.

Regards,
Brandon Thomas

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] list of macro & #lang resources

2016-01-27 Thread Matthew Butterick
For an upcoming project I'm assembling a list of articles & videos about making 
macros and #langs in Racket. These are the ones I've come across, but I welcome 
suggestions of others I've missed.

Culpepper & Alvis: Basic Macrology 
http://rmculpepper.github.io/malr/basic.html

Felleisen: Racket Is ...
http://www.ccs.neu.edu/home/matthias/Thoughts/Racket_is.html

Flatt: DSL Embedding
https://www.youtube.com/watch?v=WQGh_NemRy4

Flatt: Creating Languages in Racket
http://delivery.acm.org/10.1145/207/2063195/p48-flatt.pdf

Flatt: Building Languages in Racket
https://www.youtube.com/watch?v=y1rOWZkALto

Flatt: Macros, Modules, and Languages
https://www.youtube.com/watch?v=Z4qn9NFfb9s

Hendershott: Fear of Macros
http://www.greghendershott.com/fear-of-macros/

Yoo: Fudging Up a Racket
http://www.hashcollision.org/brainfudge/

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Implement Topological sorting in HtDP-y way?

2016-01-27 Thread Matthias Felleisen

On Jan 19, 2016, at 3:41 AM, pvt  wrote:

> I just read the chapter of graph traversing from HtDP's website:
> 
> http://www.ccs.neu.edu/home/matthias/HtDP2e/part_five.html#%28part._fsm._sec~3atraverse-graph1%29
> 
> The implementation of graph traversing is the calling sequence of:
> `find-path` -> `find-path/list` -> `find-path` -> ...
> 
> I must say this is the best DFS algorithm implementation  I've ever found.
> 
> I was wondering if maybe we could implement the topological sorting algorithm 
> in this HtDP-y way? Or maybe is that not necessary?
> 
> I found an elegant implementation in which the core function `visit` call 
> itself:
> https://github.com/carl-eastlund/mischief/blob/master/mischief/sort.rkt


Thanks for the entertaining question. 

If you go to Wikipedia, you find a graph and several suggested topological 
sorts. 
You also find a natural description of an algorithm for DAGs: 

 -- find vertices w/o incoming edges 
 -- list those
 -- remove them from the graph 
 -- repeat until graph is empty 

This clearly suggests a generative recursion. If you represent graphs as list 
of list of nodes plus outgoing connections, you can see that you reverse the 
algorithm and of course you get a reversed topological sorted list back. You 
can use reverse or you can throw in an accumulator and get the result directly. 

This is how a person thinks who has absorbed HtDP completely. 

;; A Directed Graph 
;; Graph = [Listof [NEListof Node]]
;; Node = N
;; constraint: every non-first node on any list must start some other list
;; interpretation:
;; each list represents the source node and
;; to which destination nodes it connects

(define the-graph
  '((5 11)
(11 2 9 10)
(2)
(7 11 8)
(8 9)
(9)
(3 8 10)
(10)))

;; -
;; Graph -> [Listof Node]
;; generative recursion with accumulator
;; remove nodes without outgoing edges and accumulate tho

(check-member-of (topological-sort the-graph)
 '(5 7 3 11 8 2 9 10) ; (visual left-to-right, top-to-bottom)
 '(3 5 7 8 11 2 9 10) ; (smallest-numbered available vertex 
first)
 '(5 7 3 8 11 10 9 2) ; (fewest edges first)
 '(7 5 11 3 10 8 9 2) ; (largest-numbered available vertex 
first)
 '(5 7 11 2 3 8 9 10) ; (attempting top-to-bottom, 
left-to-right)
 '(3 7 5 8 11 10 9 2) ; the one that we actually find 
 '(3 7 8 5 11 10 2 9)) ; (arbitrary]
 
(define (topological-sort G0)
  (local (;; Graph [Listof Node] -> [Listof Node]
  ;; accumulator: done are topologically sorted nodes between G and G0
  (define (topological-sort G done)
(cond
  [(empty? G) done]
  [else
   (local ((define next (no-out-going-edges G))
   (define G-next (foldr remove-node G next)))
 (topological-sort G-next (append next done)))])))
(topological-sort G0 '(

;; -
;; Graph -> [Listof Node]
;; find vertices w/o outgoing edges 

(check-expect (no-out-going-edges the-graph) '(2 9 10))

(define (no-out-going-edges G)
  (map first (filter (lambda (c) (empty? (rest c))) G)))

;; -
;; Node Graph -> Graph
;; remove the node from this graph

(define the-graph-without-9
  '((5 11)
(11 2 10)
(2)
(7 11 8)
(8)
(3 8 10)
(10)))

(check-expect (remove-node 9 the-graph) the-graph-without-9)

(define (remove-node n G)
  (local (;; [Listof Node] -> [Listof Node]
  (define (remove-connectivity c)
(remove-all n c)))
(filter cons? (map remove-connectivity G


EXERCISE Design a function is-sorted-topologically?. It consumes a list of 
nodes and a (directed acyclic) graph. Its result is #true if the given list is 
sorted topologically. Otherwise it produces #false. 

Once you have designed the function, use it to simplify the test case for 
topological sort. []

 

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Odd behavior of syntax-case

2016-01-27 Thread reilithion
I was trying to write a macro today (I'm not too good at writing macros yet) 
and I ran up against a wall when I realized that syntax-case wasn't behaving 
the way I expected. I boiled down the behavior to the following test case:

#lang racket/base

(require (for-syntax racket/base))

;; Intended behavior:
;; Break the world unless save-the-world is
;; ANYWHERE in the list of arguments
(define-syntax (break-the-world stx)
  (syntax-case stx (save-the-world)
[(_ trash-left ... save-the-world . trash-right)
 #'(begin (display 'hooray) (newline))]
[_ #'(error "The world is broken!")]))

(break-the-world save-the-world)
(break-the-world 'trash1 save-the-world)
(break-the-world save-the-world 'trash2)
(break-the-world 'trash1 save-the-world 'trash2)

The first two expansions of break-the-world result in "hooray". The last two, 
on the other hand, break the world. I gather that 'trash2 is not being matched 
by the ". trash-right" part of my pattern. What I don't understand is why, 
especially when this kind of thing seems to work if I use syntax-parse instead.

Is there a more idiomatic way of matching a pattern anywhere in a list?

Thanks!
Lucas Paul

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] list of macro & #lang resources

2016-01-27 Thread Benjamin Greenman
The Racket Manifesto
http://www.ccs.neu.edu/home/matthias/manifesto/

On Wed, Jan 27, 2016 at 11:31 AM, Matthew Butterick  wrote:

> For an upcoming project I'm assembling a list of articles & videos about
> making macros and #langs in Racket. These are the ones I've come across,
> but I welcome suggestions of others I've missed.
>
> Culpepper & Alvis: Basic Macrology
> http://rmculpepper.github.io/malr/basic.html
>
> Felleisen: Racket Is ...
> http://www.ccs.neu.edu/home/matthias/Thoughts/Racket_is.html
>
> Flatt: DSL Embedding
> https://www.youtube.com/watch?v=WQGh_NemRy4
>
> Flatt: Creating Languages in Racket
> http://delivery.acm.org/10.1145/207/2063195/p48-flatt.pdf
>
> Flatt: Building Languages in Racket
> https://www.youtube.com/watch?v=y1rOWZkALto
>
> Flatt: Macros, Modules, and Languages
> https://www.youtube.com/watch?v=Z4qn9NFfb9s
>
> Hendershott: Fear of Macros
> http://www.greghendershott.com/fear-of-macros/
>
> Yoo: Fudging Up a Racket
> http://www.hashcollision.org/brainfudge/
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: appending files

2016-01-27 Thread Scotty C
On Wednesday, January 27, 2016 at 2:57:42 AM UTC-6, gneuner2 wrote:

> What is this other field on which the file is sorted?
this field is the cost in operators to arrive at the key value

> WRT a set of duplicates: are you throwing away all duplicates? Keeping
> the 1st one encountered?  Something else?
keep first instance, chuck the rest

> This structure uses a lot more space than necessary.  Where did you
> get the idea that a bignum is 10 bytes?
not sure about the 10 bytes. if i shove 5 128 bit keys into a bignum is that 
about 80 bytes plus some overhead? so 80*600 is 480 mb not including 
overhead.

> In the worst case of every key being a bignum
no, every key is contained within a bignum which can contain many many keys.

> Since you are only comparing the hash entries for equality, you could
> save a lot of space [at the expense of complexity] by defining a
> {bucket x chain_size x 16} byte array and storing the 16-byte keys
> directly.
i must be able to grow the chains. i can't make it fixed size like that.

> > have another rather large bignum in memory that i use to reduce
> >but not eliminate record duplication of about .5 gb. 
> ???

ha, ok. this is what this bignum is for. cycle elimination. a sequence of 
operators (2 bit per) when strung together is a number like 4126740 which 
represents the operator sequence (0 0 3 0 0 3 1 1 2 2 1). i change that bit in 
the bignum from 0 to 1. during data generation i look up my about to be applied 
operator sequence in the bignum. if i see a one, i skip data generation. i'm 
not really happy with the volume of memory this takes but it is an insanely 
fast lookup and keeps a ton of data off the hard drive.

> In the meantime, I would suggest you look up "merge sort" and it's
logarithmic? not happening

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] list of macro & #lang resources

2016-01-27 Thread Alex Knauth
Racket: Metaprogramming Time! by Matthew Flatt
http://www.infoq.com/presentations/racket

Helped me a lot in understanding how phases work. I don't see that on your list.

> On Jan 27, 2016, at 1:07 PM, Benjamin Greenman  
> wrote:
> 
> The Racket Manifesto
> http://www.ccs.neu.edu/home/matthias/manifesto/ 
> 
> 
> On Wed, Jan 27, 2016 at 11:31 AM, Matthew Butterick  > wrote:
> For an upcoming project I'm assembling a list of articles & videos about 
> making macros and #langs in Racket. These are the ones I've come across, but 
> I welcome suggestions of others I've missed.
> 
> Culpepper & Alvis: Basic Macrology
> http://rmculpepper.github.io/malr/basic.html 
> 
> 
> Felleisen: Racket Is ...
> http://www.ccs.neu.edu/home/matthias/Thoughts/Racket_is.html 
> 
> 
> Flatt: DSL Embedding
> https://www.youtube.com/watch?v=WQGh_NemRy4 
> 
> 
> Flatt: Creating Languages in Racket
> http://delivery.acm.org/10.1145/207/2063195/p48-flatt.pdf 
> 
> 
> Flatt: Building Languages in Racket
> https://www.youtube.com/watch?v=y1rOWZkALto 
> 
> 
> Flatt: Macros, Modules, and Languages
> https://www.youtube.com/watch?v=Z4qn9NFfb9s 
> 
> 
> Hendershott: Fear of Macros
> http://www.greghendershott.com/fear-of-macros/ 
> 
> 
> Yoo: Fudging Up a Racket
> http://www.hashcollision.org/brainfudge/ 
> 


-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Playing sound

2016-01-27 Thread Jordan Johnson
Hi all,

I'm looking at audio again because I have a student doing a game project for 
which he wants background music to play — and I'm feeling some trepidation 
based on most of the audio-tagged packages on pkgs.racket-lang.org showing some 
test failures or other build problems.

His minimal goal is to loop a sound file from the start of play until the end, 
and have it work on Windows and Mac. We tried play-sound from racket/gui/base, 
and it almost fits: we just need to be able to kill the sound at the end, and 
it seems that killing the sound-playing thread is not enough. Is there a way to 
stop the audio?

His ideal goal is to be able to play different background music depending on 
the room the player is in. AFAICT this is beyond play-sound's ability, at least 
if he wants to do smooth transitions. Is there another library which would 
facilitate cross-fading from one audio file to another?

More generally, what are currently our best tools for incorporating music and 
sound effects into Racket games?

Best,
Jordan

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: appending files

2016-01-27 Thread Scotty C
> Is it important to retain that sorting?  Or is it just informational?
it's important

> Then you're not using the hash in a conventional manner ... else the
> filter entries would be unique ... and we really have no clue what
> you're actually doing.  So any suggestions we give you are shots in
> the dark.
using it conventionally? absolutely. it is a hash with separate chaining. will 
a bit of code help?

(define (put p)
  (define kh (hashfn p))
  (vector-set! tble kh (bitwise-ior (arithmetic-shift (vector-ref tble kh) 
shftl) p)))

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] lazy vs strict, raco vs racket

2016-01-27 Thread Jay McCarthy
On Wed, Jan 27, 2016 at 3:26 AM, Linh Chi Nguyen
 wrote:
> Hello, so i have this racket code to run simulation, and I can run it on 
> sorta a cluster (the lab utilises the computer classroom at night and turns 
> it into a cluster - as they call it).
>
> The thing is when i run on the cluster, it doesn't accept any kind of GUI, so 
> i have to switch from (require plot) into (require plot/no-gui), even if i 
> don't use any "plot" in the process. I mean i just print out some data and 
> the "plot" function (though appear in the code script) isn't used at all. 
> This error doesn't happen when i run on my laptop.
>
> Does that mean that when i run the code on laptop, it's a lazy evaluation, 
> and on the cluster, it's a strict evaluation?

No, lazy vs strict is not what is going on.

When you `(require X)` then the code at the top-level of X is run. So
if the module were something like

```
(module X racket/base (printf "Hi!\n"))
```

then just doing the `require` will cause `printf` to run and display something.

When you run `(require plot)` it eventually does a `(require
racket/gui)` which has a line that is something like
`(initialize-the-GUI-system!)` at the top-level. You can't run that
line without, for instance X11, so it would fail. But `(require
plot/no-gui)` makes sure to never require something with that at the
top-level.

> Second point, what is raco vs racket? For example, when this line "racket -t 
> file.rkt" is used, it runs the main module inside the file. And when this 
> line "raco test -s test-name file.rkt" it runs a test module inside the file.
>
> However, the former line will print out data in the current directory, the 
> latter print out data in the code script directory (if these two directories 
> are different). So why is that?

`racket` is the binary for the virtual machine that can load Racket
programs and run them.

`raco` is a particular Racket program that has a bunch of sub-commands
implementing tools inside for Racket. If you run `raco --help` it will
give a list of all the sub-commands. On my machine, I get this list:

```
Usage: raco   ...  ...

Frequently used commands:
  docs search and view documentation
  make compile source to bytecode
  setupinstall and build libraries and documentation
  pkg  manage packages
  planet   manage Planet package installations
  covera code coverage tool
  exe  create executable
  test run tests associated with files/directories

All available commands:
  check-requires   check for useless requires
  contract-profile profile overhead from contracts
  covera code coverage tool
  ctoolcompile and link C-based extensions
  decompiledecompile bytecode
  demodularize produce a whole program from a single module
  dependencies-graph   opens a GUI window showing transitive module
dependencies (aka `Module Browser')
  distribute   prepare executable(s) in a directory for distribution
  docs search and view documentation
  exe  create executable
  expand   macro-expand source
  fc   change directory to given collection's location
  link manage library-collection directories
  make compile source to bytecode
  pack pack files/collections into a .plt archive
  pkg  manage packages
  planet   manage Planet package installations
  profile  profile execution time
  puresuri run a puresuri slideshow
  read read and pretty-print source
  scribble render a Scribble document
  setupinstall and build libraries and documentation
  show-dependenciesshow module dependencies
  test run tests associated with files/directories
  unpack   unpack files/collections from a .plt archive

A command can be specified by an unambiguous prefix.
See `raco help ' for help on a command.
```

You can think of `raco` as a Racket-specific set of executables
similar to /bin, /usr/bin, /usr/local/bin, etc but where all the
commands are implemented in Racket and are managed by the Racket
environment.

The particular command that you mention, `raco test`, is a program for
running Racket programs and seeing if they have tests and if those
tests are successful.

Jay

> From a non-programmer perspective,
> chi
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



-- 
Jay McCarthy
Associate Professor
PLT @ CS @ UMass Lowell
http://jeapostrophe.github.io

   "Wherefore, be not weary 

[racket-users] Re: appending files

2016-01-27 Thread George Neuner

Hi Scotty,

I rearranged your message a bit for (my own) clarity.


On Tue, 26 Jan 2016 18:40:28 -0800 (PST), Scotty C
 wrote:

>running 64 bit linux mint OS on a 2 core laptop with 2 gb of ram. 

>the generated keys are random but i use one of the associated 
>fields for sorting during the initial write to the hard drive. what goes 
>in each of those files is totally random but dupes do not run across
>files. also, the number of keys is >1e25.

>my key is 128 bits with ~256 bits per record. so my 1 gb file contains
>~63 million records and ~32 million keys. about 8% will be dupes 
>leaving me with ~30 million keys. 

You have 63 million 256-bit records in a 1GB file?  The description of
your "records" is vague and can be interpreted more than 1 way, but my
math says:

  256 bits (128 key + 128 data) = 32M records
  384 bits (128 key + 256 data) = ~22M records
 

Anyway ... I'll assume for filtering only the keys are important.

If I understand correctly, you have a 128-bit "key space" and that you
expect to see some small percentage of the keys duplicated.   The file
is sorted on some other non-key value.

What is this other field on which the file is sorted?

WRT a set of duplicates: are you throwing away all duplicates? Keeping
the 1st one encountered?  Something else?


>i run a custom built hash. i use separate chaining with a vector of
>bignums. i am willing to let my chains run up to 5 keys per chain
>so i need a vector of 6 million pointers. that's 48 mb for the array.
>another 480 mb for the bignums. let's round that sum to .5 gb. 

This structure uses a lot more space than necessary.  Where did you
get the idea that a bignum is 10 bytes?

Unless something changed recently, Racket uses the GMP library for
multiple precision math.  GMP bignums require 2 allocations: a struct
plus an array.  Repesenting a full size 128-bit value on a 64-bit
machine takes 40-48 bytes depending on the library's int size, and not
including any allocator overhead.

In the worst case of every key being a bignum, your hash is using 2.6
to 3.1 GB ... not 0.5 GB.  Again, not including allocator overhead.

Since you are only comparing the hash entries for equality, you could
save a lot of space [at the expense of complexity] by defining a
{bucket x chain_size x 16} byte array and storing the 16-byte keys
directly.  

Even using an auxillary vector for tracking residency in the array,
you'd be down under 1GB for the same 6 million hash entries.

Even so, this seems like a bad approach to me.


> have another rather large bignum in memory that i use to reduce
>but not eliminate record duplication of about .5 gb. 

???

>i'm attempting to get this thing to run in 2 places so i need 2 hashes.
>add this up .5+.5+.5 is 1.5 gb and that gets me to about my memory 
>limit. 


I don't have time just now to sketch a split/merge solution ... I'll
get back to it later.  The file already being sorted on an unfiltered
field is a complication.  Offhand I don't think the sort order can be
maintained through the filtering ... meaning the filtered file would
need to be sorted again afterward.  

But even so it might be faster than your current method.

In the meantime, I would suggest you look up "merge sort" and it's
application to "external" sorting.  That will give you the basic idea
of what I'm proposing.

George

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] Re: appending files

2016-01-27 Thread George Neuner
Sorry.   I shouldn't do math at 4am. Ignore the numbers.   However, it 
is still correct that the byte array will use less space than an array 
of bignums.

George


On 1/27/2016 3:54 AM, George Neuner wrote:

i run a custom built hash. i use separate chaining with a vector of
bignums. i am willing to let my chains run up to 5 keys per chain
so i need a vector of 6 million pointers. that's 48 mb for the array.
another 480 mb for the bignums. let's round that sum to .5 gb.

This structure uses a lot more space than necessary.  Where did you
get the idea that a bignum is 10 bytes?

Unless something changed recently, Racket uses the GMP library for
multiple precision math.  GMP bignums require 2 allocations: a struct
plus an array.  Repesenting a full size 128-bit value on a 64-bit
machine takes 40-48 bytes depending on the library's int size, and not
including any allocator overhead.

In the worst case of every key being a bignum, your hash is using 2.6
to 3.1 GB ... not 0.5 GB.  Again, not including allocator overhead.

Since you are only comparing the hash entries for equality, you could
save a lot of space [at the expense of complexity] by defining a
{bucket x chain_size x 16} byte array and storing the 16-byte keys
directly.

Even using an auxillary vector for tracking residency in the array,
you'd be down under 1GB for the same 6 million hash entries.

Even so, this seems like a bad approach to me.



--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[racket-users] Re: appending files

2016-01-27 Thread George Neuner
On Tue, 26 Jan 2016 23:00:01 -0500, Brandon Thomas
 wrote:

>Is there anything stopping you from restructuring
>the data on disk and using the hash directly from there

Scotty's hash table is much larger than he thinks it is and very
likely is being paged to disk already.  Deliberately implementing it
as a disk file is unlikely to improve anything.

George

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] lazy vs strict, raco vs racket

2016-01-27 Thread chi

Oh thank you so much for your answer.
It helps a lot,
good day,
chi

On 27/01/2016 13:29, Jay McCarthy wrote:

On Wed, Jan 27, 2016 at 3:26 AM, Linh Chi Nguyen
 wrote:

Hello, so i have this racket code to run simulation, and I can run it on sorta 
a cluster (the lab utilises the computer classroom at night and turns it into a 
cluster - as they call it).

The thing is when i run on the cluster, it doesn't accept any kind of GUI, so i have to switch from 
(require plot) into (require plot/no-gui), even if i don't use any "plot" in the process. 
I mean i just print out some data and the "plot" function (though appear in the code 
script) isn't used at all. This error doesn't happen when i run on my laptop.

Does that mean that when i run the code on laptop, it's a lazy evaluation, and 
on the cluster, it's a strict evaluation?

No, lazy vs strict is not what is going on.

When you `(require X)` then the code at the top-level of X is run. So
if the module were something like

```
(module X racket/base (printf "Hi!\n"))
```

then just doing the `require` will cause `printf` to run and display something.

When you run `(require plot)` it eventually does a `(require
racket/gui)` which has a line that is something like
`(initialize-the-GUI-system!)` at the top-level. You can't run that
line without, for instance X11, so it would fail. But `(require
plot/no-gui)` makes sure to never require something with that at the
top-level.


Second point, what is raco vs racket? For example, when this line "racket -t file.rkt" is 
used, it runs the main module inside the file. And when this line "raco test -s test-name 
file.rkt" it runs a test module inside the file.

However, the former line will print out data in the current directory, the 
latter print out data in the code script directory (if these two directories 
are different). So why is that?

`racket` is the binary for the virtual machine that can load Racket
programs and run them.

`raco` is a particular Racket program that has a bunch of sub-commands
implementing tools inside for Racket. If you run `raco --help` it will
give a list of all the sub-commands. On my machine, I get this list:

```
Usage: raco   ...  ...

Frequently used commands:
   docs search and view documentation
   make compile source to bytecode
   setupinstall and build libraries and documentation
   pkg  manage packages
   planet   manage Planet package installations
   covera code coverage tool
   exe  create executable
   test run tests associated with files/directories

All available commands:
   check-requires   check for useless requires
   contract-profile profile overhead from contracts
   covera code coverage tool
   ctoolcompile and link C-based extensions
   decompiledecompile bytecode
   demodularize produce a whole program from a single module
   dependencies-graph   opens a GUI window showing transitive module
dependencies (aka `Module Browser')
   distribute   prepare executable(s) in a directory for distribution
   docs search and view documentation
   exe  create executable
   expand   macro-expand source
   fc   change directory to given collection's location
   link manage library-collection directories
   make compile source to bytecode
   pack pack files/collections into a .plt archive
   pkg  manage packages
   planet   manage Planet package installations
   profile  profile execution time
   puresuri run a puresuri slideshow
   read read and pretty-print source
   scribble render a Scribble document
   setupinstall and build libraries and documentation
   show-dependenciesshow module dependencies
   test run tests associated with files/directories
   unpack   unpack files/collections from a .plt archive

A command can be specified by an unambiguous prefix.
See `raco help ' for help on a command.
```

You can think of `raco` as a Racket-specific set of executables
similar to /bin, /usr/bin, /usr/local/bin, etc but where all the
commands are implemented in Racket and are managed by the Racket
environment.

The particular command that you mention, `raco test`, is a program for
running Racket programs and seeing if they have tests and if those
tests are successful.

Jay


 From a non-programmer perspective,
chi

--
You received this message because you are subscribed to the Google Groups "Racket 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.





--
You received this message