Re: Functions using locks are slowed even when locks are never taken
Verified that I receive the same result patterns as you on my machine. One other possibility outside of what you have already mentioned would be something JIT-related. Perhaps there is an optimization that can be performed if the locking sections of the code are in another function but otherwise no, but I'm not sure how that dovetails with Clojure' fn compilation. On Friday, November 1, 2013 11:53:12 AM UTC-7, Michael Blume wrote: > > Since 1.6 alpha is out, I reran the tests with that -- basically the same > results. > > On Friday, November 1, 2013 11:34:15 AM UTC-7, Michael Blume wrote: >> >> https://github.com/MichaelBlume/perf-test >> >> (ns perf-test >> (:use (criterium core)) >> (:gen-class)) >> >> (def o (Object.)) >> >> (defn foo [x] >> (if (> x 0) >> (inc x) >> (do >> (monitor-enter o) >> (let [res (dec x)] >> (monitor-exit 0) >> res >> >> (defn bar [x] >> (if (> x 0) >> (inc x) >> (dec x))) >> >> (defn locking-part [x l] >> (monitor-enter l) >> (let [res (dec x)] >> (monitor-exit l) >> res)) >> >> (defn baz [x] >> (if (> x 0) >> (inc x) >> (locking-part x o))) >> >> (defn -main [] >> (println "benching foo") >> (bench (foo 5) :verbose) >> (println "benching bar") >> (bench (bar 5) :verbose) >> (println "benching baz") >> (bench (baz 5) :verbose) >> (println "done benching")) >> >> >> >> I'm only ever calling these functions with positive values, so the >> monitor-enter branch should never be entered. Nevertheless, the performance >> of foo is much worse than bar or baz. >> >> The best guess I've got is that the fact that lock-taking is involved >> somehow changes how the function is compiled, somehow making the function >> slower. If the practical upshot is that I shouldn't write functions that >> only sometimes lock -- that the locking part of a function should always be >> its own function -- then I can do that, but I'm curious why. >> >> $ lein uberjar >> Compiling perf-test >> Created /Users/mike/perf-test/target/perf-test-0.1.0-SNAPSHOT.jar >> Created /Users/mike/perf-test/target/perf-test-0.1.0-SNAPSHOT-standalone.jar >> $ java -jar -server target/perf-test-0.1.0-SNAPSHOT-standalone.jar >> benching foo >> WARNING: Final GC required 1.5974571326266802 % of runtime >> x86_64 Mac OS X 10.8.3 4 cpu(s) >> Java HotSpot(TM) 64-Bit Server VM 24.0-b28 >> Runtime arguments: >> Evaluation count : 391582560 in 60 samples of 6526376 calls. >> Execution time sample mean : 167.426696 ns >> Execution time mean : 167.459429 ns >> Execution time sample std-deviation : 4.079466 ns >> Execution time std-deviation : 4.097819 ns >>Execution time lower quantile : 160.742869 ns ( 2.5%) >>Execution time upper quantile : 175.453376 ns (97.5%) >>Overhead used : 1.634996 ns >> >> Found 2 outliers in 60 samples (3. %) >> low-severe 2 (3. %) >> Variance from outliers : 12.5602 % Variance is moderately inflated by >> outliers >> benching bar >> x86_64 Mac OS X 10.8.3 4 cpu(s) >> Java HotSpot(TM) 64-Bit Server VM 24.0-b28 >> Runtime arguments: >> Evaluation count : 2174037300 in 60 samples of 36233955 calls. >> Execution time sample mean : 26.068923 ns >> Execution time mean : 26.066422 ns >> Execution time sample std-deviation : 0.887937 ns >> Execution time std-deviation : 0.916861 ns >>Execution time lower quantile : 23.996763 ns ( 2.5%) >>Execution time upper quantile : 27.911936 ns (97.5%) >>Overhead used : 1.634996 ns >> >> Found 3 outliers in 60 samples (5. %) >> low-severe 1 (1.6667 %) >> low-mild 1 (1.6667 %) >> high-mild1 (1.6667 %) >> Variance from outliers : 22.1874 % Variance is moderately inflated by >> outliers >> benching baz >> x86_64 Mac OS X 10.8.3 4 cpu(s) >> Java HotSpot(TM) 64-Bit Server VM 24.0-b28 >> Runtime arguments: >> Evaluation count : 2270676660 in 60 samples of 37844611 calls. >> Execution time sample mean : 25.834142 ns >> Execution time mean : 25.837429 ns >> Execution time sample std-deviation : 0.718382 ns >> Execution time std-deviation : 0.729431 ns >>Execution time lower quantile : 24.837925 ns ( 2.5%) >>Execution time upper quantile : 27.595781 ns (97.5%) >>Overhead used : 1.634996 ns >> >> Found 4 outliers in 60 samples (6.6667 %) >> low-severe 2 (3. %) >> low-mild 2 (3. %) >> Variance from outliers : 15.7591 % Variance is moderately inflated by >> outliers >> done benching >> >> -- -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visi
Re: Question regarding structural sharing, records, and maps
On Jul 29, 4:38 am, Ken Wesson wrote: > Yes. In fact this should work with records, too -- all the "native" > fields of the record need to be copied for each version, but most will > typically be pointers and most of those will typically be to existing > objects. Only the ones that were changed will point to new nested > objects. So unless your records have huge numbers of predefined fields > there shouldn't be much time or memory cost in copying them. Thanks for clearing that up. I wasn't exactly sure how the memory model of records worked. The semantics that records provide seem like a good fit for applications like this. > Probably. Any two changes to the tree will cause one to retry in the > single-ref case, unless you're sure all the changes will commute. > (Changes to distinct fields where one change doesn't depend on what > the other changes will commute, and changes to the same field that > don't matter as to sequencing, such as inc and dec, will commute, but > otherwise no.) Yeah, tree updates are pretty much strictly non-commutative. The retry overhead will probably grow quite fast with the number of concurrent updates. > > You will probably want something structured more like a database, with > "tables" that are maps and "rows" that are records or more complex, > nested structures, such that most concurrent changes will not affect a > common row. The "tables" can then be refs of maps of refs, with most > changes operating on the row-refs. The table refs only need to change > if rows are inserted or deleted; those could be bottlenecks, though if > all objects have GUIDs as keys in these tables then row insertions and > deletions will commute, leaving only the GUID generators for each > table. I think I understand the structure here, but just to be clear, you're proposing flattening of the tree structure from something like this: (ref { :contents (some stuff), :child-node1 { :contents (other things) :child-node1 ... }, :child-node2 { :contents (thingy1 thingy2) ... } ... }) into something like this? (ref { :some-guid (ref {<<< could be a map, record, or what have you :contents (some stuff) :child-node1 :another-guid :child-node2 :and-another-guid }) :another-guid (ref { :contents (other things) ... }) :and-another-guid (ref { :contents (thingy1 thingy2) ... }) } Interesting. I feel like I should have thought of this since I've done something similar in more object oriented code, but riddled with lots of fine grained locking. Here if updating one row references another the transaction is going to handle the snapshotting. I begin to see... > > Unfortunately, (let [new-id (commute table-x-guid-counter inc)]) > sounds nice but might result in new-id taking on the same value in two > concurrent transactions, so you'll need to use alter and GUID > generation may be a bottleneck. But it may be much faster than your > other, more complex transactions. Be sure to do it separately, e.g.: > > (let [new-id (dosync (alter table-x-guid-counter inc))] > (dosync > (commute table-x assoc new-id some-thingy))) > > You can even use atoms for the guid counters instead of refs, and then > the ! in swap! will remind you to get the ID before entering a > transaction. Do as much of what's needed to generate some-thingy > before the transaction, too -- though if it depends on other refs' > values, and those values should be current rather than it being > acceptable for them to be values that were there recently but might > have changed, then some of the work will need to be done in the > transaction. > > > I've considered wrapping each node in > > a ref and performing updates that way, or possibly taking the async > > route and dispatching cals using agents, but my experience there is > > nil. Any insight there? > > I'd avoid having refs nesting more than two deep. Complex arrangements > of cross-references among nodes can create a giant headache in > combination with refs and transactions, so using a "database" approach > with "table" maps, ID token keys, and cross-references to ID tokens > rather than objects is often sensible. In particular, complex objects > with nested refs effectively have mutable state inside them, which is > usually not desired in Clojure. (One can argue, though, that the ref > object is not different in principle from the ID token and removes a > layer of indirection, and possibly obviates the need to generate IDs > from counters that act as bottlenecks -- though, really, the Java new > operator's allocation of memory now takes the role of ID generator. > I'm given to understand modern VMs do some clever things to make new > able to be fairly concurrent, though, such as giving each thread its > own subset of the eden generation to allocate in.) Makes sense. I need to look into generating GUIDs quickly in a concurrent manner for other parts of the application anyways. I'm sure if this become
Re: Question regarding structural sharing, records, and maps
On Jul 29, 8:58 am, Michael Gardner wrote: > > An atom would be more appropriate, unless you're coordinating updates to the > tree with updates to some other ref. Ah, for some reason that didn't occur to me, thanks. Since the tree in this case acts as a supporting index on spatial data, it pretty much defers to the outcome of any updates to the original entities. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Question regarding structural sharing, records, and maps
Hi all, this is my first post. I'm currently building a tree data structure (quadtree to be exact) while doing some exploratory game server programming using Clojure. This tree will end up with a lot of "entities" stored in the nodes, whose positions determine which node they end up in. As game state progresses, updates will come from player and AI controlled entities which need to be reflected in the tree. Since player updates will come from any number of external clients, the tree will need some form of concurrency management. So far, I have settled on just using a single ref wrapping the root node of the tree with all update functions acting on that ref via dosync. My question is one of implementation, as I don't yet have a good feeling how different approaches to storing data will affect memory usage and performance in the long run and I'm trying to better my intuition for designing applications with Clojure. I am currently using maps to represent the nodes of the tree, with the hypothesis being that new versions of the tree will save memory and do less thrashing this way. I'm aware that Clojure's persistent data structures have some method of structural sharing between different states of the same identity, which is what leads me to that guess. I haven't found anything that conclusively says that structures defined with defrecord do not perform this sort of sharing, but from what I can tell it seems like they do not, which makes sense given that they provide accessors and expose a Java class. So, some trade-off questions would be: 1.) Does structural sharing play well with nested structures? With a tree whose nodes are represented by nested maps, if a leaf node is updated with new data, will structural sharing efficiently represent the new version of the tree as "all the rest of the tree" + modified node? 2.) If one were to use nested records instead, I assume producing a new version of the tree will result in a cloning operation over the nodes of the tree, less the nodes we are modifying ourselves. If updates were being performed very frequently, is there a possibility of adverse interaction with a GC? I know that GC implementations differ, but it seems like this could put a lot of pressure on a typical GC. 3.) Are there alternatives to a single ref around the root node of the tree that are worth exploring? I've considered wrapping each node in a ref and performing updates that way, or possibly taking the async route and dispatching cals using agents, but my experience there is nil. Any insight there? As an aside, I am not looking for specific answers as without numbers or hard data there really can't be any right now. I will end up measuring the performance of these approaches, but I do want to have some understanding of how the variables are related when trying to understand the empirical data. Thanks, Trent -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en