Yeah, one nice property of this is that all the underlying objects play nicely with byte-streams [1], so it's trivial to do something like:
---
;; Wrap the file as a sequence of `type` records.
(def s (vertigo.core/wrap type "some-huge-file"))

;; Sort the record indices by a nested field, then append each record to
;; the output in that order. `sorted-file` is the output file, defined elsewhere.
(let [indices (sort-by #(vertigo.core/get-in s [% :foo :bar])
                       (range (count s)))]
  (doseq [idx indices]
    (byte-streams/transfer (nth s idx) sorted-file {:append true})))
---

If there's framing information we won't be able to do this quite as efficiently, but it shouldn't be too much more expensive.

[1] https://github.com/ztellman/byte-streams

On Mon, Jul 15, 2013 at 9:48 PM, kovas boguta <kovas.bog...@gmail.com> wrote:

> Interesting. This seems like a pretty promising direction for the bottom
> of the big-data stack.
>
> A use case on my mind is sorting a big list of data structures by key (or
> some set of keys/paths).
>
> Once the data gets big, you need to do an external sort, which means tons
> of serialization round trips if implemented naively. Being able to just
> pluck out the values you need really helps in that case. Besides saving on
> the serialization overhead, it also cuts down on memory, which means you can
> sort much bigger segments at a time and complete the overall sort in fewer
> passes.
>
> On Mon, Jul 15, 2013 at 1:40 PM, Zach Tellman <ztell...@gmail.com> wrote:
>
>> If you (vertigo.core/wrap "a-file-name"), that will use mmap under the
>> covers, so if no one's tried it, it's easy enough to start.
>>
>> With respect to non-fixed data layouts, that could be supported by a
>> library which parses the framing information, and then layers Vertigo atop
>> the actual data. In effect, that's what Gloss [1] is going to become, so
>> keep watching the skies.
>>
>> Zach
>>
>> [1] https://github.com/ztellman/gloss
>>
>> On Sun, Jul 14, 2013 at 9:16 PM, kovas boguta <kovas.bog...@gmail.com> wrote:
>>
>>> This is pretty neat.
>>>
>>> Anyone try using this in conjunction with mmap?
>>>
>>> It would be nice to have some way to deal with strings & other
>>> variable-length data.
>>>
>>> I'm also curious if it's possible to make the analog of this for
>>> fressian, basically to avoid unpacking objects that are not necessary for
>>> the computation at hand.
>>>
>>> On Tue, Jul 9, 2013 at 8:56 PM, Zach Tellman <ztell...@gmail.com> wrote:
>>>
>>>> Last year, I gave a talk at the Conj on my attempt to write an AI for
>>>> the board game Go. Two things I discovered were that it was hard to get
>>>> predictable performance, and that even once I made sure I had all the
>>>> right type hints, there was still a lot of room at the bottom for
>>>> performance improvements. Towards the end [1], I mentioned a few ideas for
>>>> improvements, one of which was simply using ByteBuffers rather than objects
>>>> to host the data. This would remove all the levels of indirection, giving
>>>> much better cache coherency, and also allow for fast unsynchronized
>>>> mutability when the situation called for it.
>>>>
>>>> So, ten months and several supporting libraries [2] [3] later, here it
>>>> is: https://github.com/ztellman/vertigo
>>>>
>>>> At a high level, this library is useful whenever your datatype has a
>>>> fixed layout and is used more than once. Depending on your type, it will
>>>> give you moderate to large memory savings, and if you're willing to forgo
>>>> some of the core library in favor of Vertigo's operators, you can get
>>>> significant performance gains on batch operations. And, in the cases where
>>>> performance doesn't matter, it will behave exactly like any other Clojure
>>>> data structure.
>>>>
>>>> I want to point out that something like this would be more or less
>>>> impossible in Java; reading from an offset in a ByteBuffer without the
>>>> compile-time inference and validation provided by this library would be
>>>> pointlessly risky. There aren't a lot of low-level Clojure libraries, but
>>>> there's an increasing amount of production usage where people are using
>>>> Clojure for performance-sensitive work.
>>>> I'm looking forward to seeing what people do with Vertigo and
>>>> libraries like it.
>>>>
>>>> Zach
>>>>
>>>> [1] http://www.youtube.com/watch?feature=player_detailpage&v=v5dYE0CMmHQ#t=1828s
>>>> [2] https://github.com/ztellman/primitive-math
>>>> [3] https://github.com/ztellman/byte-streams
>>>>
>>>> --
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Clojure" group.
>>>> To post to this group, send email to clojure@googlegroups.com
>>>> Note that posts from new members are moderated - please be patient with
>>>> your first post.
>>>> To unsubscribe from this group, send email to
>>>> clojure+unsubscr...@googlegroups.com
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/clojure?hl=en
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Clojure" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to clojure+unsubscr...@googlegroups.com.
>>>>
>>>> For more options, visit https://groups.google.com/groups/opt_out.
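
To make the point in the original announcement about reading from an offset in a ByteBuffer concrete, here is a minimal sketch using only java.nio interop. It is not Vertigo's API or implementation: the 12-byte record layout, `record-size`, `write-record!`, `read-id` and `read-score` are all hypothetical names invented for illustration.

```clojure
(import '(java.nio ByteBuffer))

;; Hypothetical fixed layout, for illustration only: each record is
;; 12 bytes, an int32 id at offset 0 and a float64 score at offset 4.
(def record-size 12)

(defn write-record!
  "Writes one record at position idx using absolute puts."
  [^ByteBuffer buf idx id score]
  (let [base (* (long idx) record-size)]
    (.putInt buf (int base) (int id))
    (.putDouble buf (int (+ base 4)) (double score))
    buf))

(defn read-id
  "Reads only the id field of record idx."
  [^ByteBuffer buf idx]
  (.getInt buf (int (* (long idx) record-size))))

(defn read-score
  "Reads only the score field of record idx; no other bytes are decoded."
  [^ByteBuffer buf idx]
  (.getDouble buf (int (+ (* (long idx) record-size) 4))))

;; Three records stored back to back, with no per-record objects.
(def buf (ByteBuffer/allocate (* 3 record-size)))
(write-record! buf 0 7 2.5)
(write-record! buf 1 8 0.5)
(write-record! buf 2 9 1.5)
```

Nothing here checks that the hand-computed offsets agree with the layout, which is the risk the announcement says the library's compile-time inference and validation is meant to remove.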
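
The sorting use case discussed in this thread (pluck out only the key, sort indices, then copy the raw bytes in order) can be sketched without Vertigo or byte-streams, using plain java.nio. This is an illustration under assumptions, not how either library works internally: the 8-byte record layout and the names `rec-size`, `record-count`, `key-at` and `sorted-copy` are hypothetical.

```clojure
(import '(java.nio ByteBuffer))

;; Hypothetical layout, for illustration only: 8-byte records,
;; an int32 sort key followed by an int32 payload.
(def rec-size 8)

(defn record-count [^ByteBuffer buf]
  (quot (.capacity buf) rec-size))

(defn key-at
  "Plucks just the sort key of record idx with an absolute read."
  [^ByteBuffer buf idx]
  (.getInt buf (int (* (long idx) rec-size))))

(defn sorted-copy
  "Returns a new buffer holding the records of src ordered by key.
  Only the keys are decoded; records are moved as raw bytes."
  [^ByteBuffer src]
  (let [indices (sort-by #(key-at src %) (range (record-count src)))
        dst     (ByteBuffer/allocate (.capacity src))]
    (doseq [idx indices]
      ;; A duplicate shares the bytes but has its own position and limit,
      ;; so it can act as a window over exactly one record.
      (let [rec (.duplicate src)]
        (.position rec (int (* (long idx) rec-size)))
        (.limit rec (int (* (inc (long idx)) rec-size)))
        (.put dst rec)))
    (.flip dst)
    dst))
```

Because `FileChannel.map` returns a `MappedByteBuffer`, which is a `ByteBuffer`, the same functions should also accept a memory-mapped file as `src`; that pairing is an assumption about usage and is not shown here.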