Re: Call for masters thesis ideas (possibly related to Clojure)
I also vote for static analysis. People have combined static analysis with data mining to predict software defects. http://scholar.google.com/scholar?hl=enq=data+mining+software+defectsbtnG=Searchas_sdt=2000as_ylo=as_vis=0 However, it doesn't look like anyone's tried it with any functional languages. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
What about adding circular dependency resolution to the compiler between Clojure - Java code ? I know, I know - it's not very sexy but it would go a long way for companies with larger legacy Java code bases that may be considering moving to Clojure. Being able to replace Java code 'one step at a time' would be invaluable in that area and it also makes Clojure easier to 'sell'. On Dec 18, 4:35 am, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
How about static analysis? Plenty of interesting problems there. For example: 1. detect incorrect uses of transients (those that would certainly lead to exceptions) and report them during compile-time. 2. related to (1), but different: detect places where the compiler can implicitly convert collections to transients and back, without losing correctness, and use that as a speed-up. 3. soft typing, either to speed up code or to detect incorrect code early. Such things have been done with Scheme. I believe Clojure has certain advantages when working on static analysis: * you can call the reader and macro expander during analysis, which greatly simplifies the language to be analyzed. * the core language is relatively simple * the compiler source is freely available and is written in a relatively high-level language (java and not c), so it (should) be relatively easy to get into. Relatedly, it might be interesting to create formal semantics for Clojure, and interface it with a proof assistant, enabling proofs to be written and checked about the behavior of clojure programs. In that context, one of the opportunities provided by lispiness is that macros can help write the proof, by automatically creating propositions about the code that they return. For example, (dotimes [i n] body) can be modified to attach these propositions to its output, using an appropriate proof language: - i=0 - i(the result of) n - the loop terminates if its body terminates That's just scratching the surface. That field is deep and wide, and a complete implementation is probably beyond the scope of a master's theses. Still, parts of it can be done. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
On Dec 19, 1:52 am, ajay gopalakrishnan ajgop...@gmail.com wrote: Put *Comparative performance evaluation of Java threads for embedded applications**: Linux thread vs. Green thread Your Google search skills are obviously beyond ours. :) I've found it now. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
Predicate dispatch would be an interesting topic. The #1 problem with predicate dispatch is method ordering. Predicate dispatch as described in the orignal paper [http://www.cs.washington.edu/homes/mernst/pubs/ dispatching-ecoop98-abstract.html] decides what order to used based on implication between predicates. For example: def foo(x) where x is Animal = A def foo(x) where x is Chicken = B def foo(x) where x is Bird = C If you call foo(new Crow) you want the method with where x is Bird to be called. A Crow is not a Chicken, so we can rule out the Chicken method. But a Crow is also an Animal, so how does the dispatch decide that it calls the Bird method and not the Animal method? The answer of the authors of the predicate dispatch paper is that you determine implication between predicates. If x is a Bird, that implies that x is an Animal. So we test if x is a Bird and only after that we consider the Animal method. This is good but unfortunately checking implication between predicates is undecidable. For example if we define this: def bar(n) where 5^n == 5 mod n = A def bar(n) where prime?(n) = B How is the static analysis going to determine that the second predicate implies the first? You can solve this case by requiring the programmer to provide implication information to the compiler, like this: implication prime?(n) = 5^n == 5 mod n As you can see this isn't nice. In my opinion we need a better way to do method ordering. I've been thinking about this and my proposal is to use the definition order in the source code to decide which methods to check first: try methods that are defined last first. def baz(x) where p(x) = A def baz(x) where q(x) = B def baz(x) where r(x) = C We first check for r(x) and if it's true we execute C. If it's false then we check if q(x) is true and execute B. If q(x) is false we check p(x) and execute A. The implementation of this kind of predicate dispatch is trivial. You don't need a complicated implication checker. It's just a simple macro. But the question is whether this simple method ordering rule works in practice. With this order you can do everything that you can do with a class hierarchy (in Java for example). It also subsumes Clojure's hierarchies by using a predicate this is itself a method. I think it's interesting to explore this further to see if this order still works well for more advanced uses of predicate dispatch, and to explore alternative orderings. Some other research ideas: - Functional reactive programming in Clojure. A problem with FRP is that you have to invert dependencies. Instead of specifying for each cause what its effects are, you have to specify for each effects what its possible causes are. Can you solve this? The problem is much easier if you don't try to prevent glitches as other FRP systems do. Is a system without glitch prevention useful, is introducing glitches for removing cause and effect inversion a good trade off? - Type/value inference with abstract interpretation. Can you write an abstact interpretation to infer possible values of a variable? Is this useful in creating a better compiler. Can the information be used to create better editors (intellisense?). - In a statically typed language like Haskell Scala you can write a generic sum function that sums a list of things. You cannot do this in any dynamic language that I know of. You want sum to work on any list of things that support +. Suppose that numbers support + and strings support + (maybe they shouldn't but bear with me for this example). sum ([1,2]) = 3. sum([ab,cd]) = abcd. Now comes the problem. What is sum([])? For numbers it should be 0, for strings it should be . In a statically typed language you can use the type of the list to determine which value to return. How can we write a generic sum function in a dynamic language? Jules On Dec 18, 1:35 pm, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
I want to thank you all for your suggestions, the clojure community is really great! On Dec 18, 1:35 pm, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Call for masters thesis ideas (possibly related to Clojure)
Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
Hi, An idea I was interested in (more from an learning opportunity perspective than thesis) is provided more Persistent data structures. As of now, we have the the Array mapped Hash tree based structure that works well for Vectors, Maps and Graphs too. What I feel is missing is Tree based structures. Although, you can simulate them using Vectors as children indices, I think if we directly implement the Persistent Data structure of Binomial/Fibonacci Heaps and BSTs with all their operations (insert,delete,search,meld), it would be much faster. I could be wrong though! Anyways ... my 2 cents. Thanks, Ajay G. -- Forwarded message -- From: Patrick Kristiansen patrick.kristian...@gmail.com Date: Fri, Dec 18, 2009 at 7:35 AM Subject: Call for masters thesis ideas (possibly related to Clojure) To: Clojure clojure@googlegroups.com Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.comclojure%2bunsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
Most of the distributed Clojure work is based on running Clojure on a distributed JVM package. Here's a link on Clojure + Terracotta. I have heard mention of other solutions as well, I just don't recall them. http://groups.google.com/group/clojure/browse_thread/thread/9a9690947617906/6bc50fb45ca56e7d Sean On Dec 18, 7:35 am, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Thanks in advance. -Patrick -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
ajay gopalakrishnan ajgop...@gmail.com writes: Hi Ajay, An idea I was interested in (more from an learning opportunity perspective than thesis) is provided more Persistent data structures. As of now, we have the the Array mapped Hash tree based structure that works well for Vectors, Maps and Graphs too. In the last few month I didn't find any time to play around with clojure or follow discussions on this list, so could you give me a pointer to persistent graph structures? That would be an area where I would be utmost interested. Bye, Tassilo -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
(0) Create a toolkit to run multiple parallel, tightly communicating clojure apps on google-app engine, simulating a single, long-running, multithreaded JVM instance that does not appear, to the user, to be limited by the constraints of GAE's java implementation (e.g. single threading, shared refs work despite being distributed); at the same time, the toolkit would minimize resources consumed in GAE, persisting threads/continuations waiting for data, and heuristically determining lowest cost for long-term storage versus memory and runtime consumption. Note: http://elhumidor.blogspot.com/2009/04/clojure-on-google-appengine.html THE BIG CAVEAT Two unusual aspects of the Google AppEngine environment create pretty major constraints on your ability to write idiomatic Clojure. First, an AppEngine application runs in a security context that doesn't permit spawning threads, so you won't be able to use Agents, the clojure.parallel library, or Futures. Second, one of the most exciting features of AppEngine is that your application will be deployed on Google's huge infrastructure, dynamically changing its footprint depending on demand. That means you'll potentially be running on many JVMs at once. Unfortunately this is a strange fit for Clojure's concurrency features, which are most useful when you have precise control over what lives on what JVM (and simplest when everything runs on one JVM). Since shared references (Vars, Refs, and Atoms) are shared only within a single JVM, they are not suitable for many of their typical uses when running on AppEngine. You should still use Clojure's atomic references (and their associated means of modification) for any state that it makes sense to keep global per-JVM, since there may be multiple threads serving requests in one JVM. But remember JVMs will come and go during the lifetime of your application, so anything truly global should go in the Datastore or Memcache. (1) a clojure implementation of Yahoo's PNUTs, using STM's and all the cool facilities clojure provides: http://research.yahoo.com/files/pnuts.pdf (interesting to have a writeup of a real-world impl alongside comparisons to Google Bigtable and Amazon Dynamo) We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. The foremost requirements of web applications are scalability, consistently good response time for geographically dispersed users, and high availability. At the same time, web applications can frequently tolerate relaxed consistency guarantees. For example, if a user changes an avatar ... little harm is done if the new avatar is not initially visible to one friend It is often acceptable to read (slightly) stale data, but occasionally stronger guarantees are required by applications. PNUTS provides a consistency model that is between the two extremes of general serializability and eventual consistency ... We provide per-record timeline consistency: all replicas of a given record apply all updates to the record in the same order The application [can] indicate cases where it can do with some relaxed consistency for higher performance [such as reading] a possibly stale version of the record. Some interesting commentary from http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html .. * * *When reading the paper, a couple things about PNUTS struck me as surprising: First, the system is layered on top of the guarantees of a reliable pub-sub message broker which acts both as our replacement for a redo log and our replication mechanism. I have to wonder if the choice to not build these pieces of the database themselves could lead to missed opportunities for improving performance and efficiency. Second, as figures 3 and 4 show, the average latency of requests to their database seems quite high, roughly 100 ms. This is high enough that web applications probably would incur too much total latency if they made a few requests serially (e.g. ask for some data, then, depending on what the data looks like, ask for some other data). That seems like a problem. Please see also my August 2006 post, Google Bigtable paperhttp://glinden.blogspot.com/2006/08/google-bigtable-paper.html, which discusses the distributed database behind many products at Google. Please see also my earlier post, Highly available distributed hash store at Amazonhttp://glinden.blogspot.com/2007/10/highly-available-distributed-hash.html, on the distributed database behind some features at Amazon.com. Please see also my earlier posts, Cassandra data store at Facebookhttp://glinden.blogspot.com/2008/08/cassandra-data-store-at-facebook.html and HBase: A Google Bigtable clonehttp://glinden.blogspot.com/2007/07/hbase-google-bigtable-clone.html . Update: One of the developers of PNUTS
Re: Call for masters thesis ideas (possibly related to Clojure)
On 18 dec, 13:35, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Hi We're two students that have been working with concurrent programming languages (Erlang and Clojure), and while both languages are very interesting, we would like to work on something related to Clojure in our masters thesis. I once asked on #clojure for ideas, and Rich Hickey suggested looking into predicate dispatch and it is one idea that we are considering. We have also considered working on distributed Clojure, but I don't know if there is already an effort going on in that regard? Do you have any other suggestions? We'd be really thankful for any suggestions. Erm, what's your master? I'll assume CS. Personally, I'm interested in whether complete thread abstraction that makes threads as light-weight as possible, but also the only way to do concurrency (like Erlang provides with its processes) is really the best way to model concurrent programs. I'm over 95% sure that native threads really are not the best way to model in-process concurrency for most programs, simply because of all the overhead that you incur especially when dealing with massively concurrent long- running thead-like-things - since you both know erlang, you will know what I'm talking about - that sort of approach really doesn't work in clojure, that's why clojure uses thread pools for agents etc. But maybe a new and lightweight in-kernel thread model is an interesing subject? Just asking :) Good luck with your thesis, Joost Diepenmaat. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
On Dec 18, 11:06 pm, Joost jo...@zeekat.nl wrote: Erm, what's your master? I'll assume CS. Well it's software engineering, but close enough :). I should have mentioned that. Personally, I'm interested in whether complete thread abstraction that makes threads as light-weight as possible, but also the only way to do concurrency (like Erlang provides with its processes) is really the best way to model concurrent programs. I'm over 95% sure that native threads really are not the best way to model in-process concurrency for most programs, simply because of all the overhead that you incur especially when dealing with massively concurrent long- running thead-like-things - since you both know erlang, you will know what I'm talking about - that sort of approach really doesn't work in clojure, that's why clojure uses thread pools for agents etc. But maybe a new and lightweight in-kernel thread model is an interesing subject? Just asking :) Thanks for your suggestion. It's an interesting idea, and we've been considering something along those lines. I don't know if Erlang's processes are even more lightweight than green threads. According to Wikipedia [1], Linux actually performs really well in terms of context switching OS threads compared to green threads. But I can't find the paper Wikipedia cites, and it seems not to be downloadable from ACM [2]. In any case, we will take this suggestion into consideration. [1] http://en.wikipedia.org/wiki/Green_threads#Performance [2] http://portal.acm.org/citation.cfm?id=603551 -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
On 19 dec, 01:20, Patrick Kristiansen patrick.kristian...@gmail.com wrote: Thanks for your suggestion. It's an interesting idea, and we've been considering something along those lines. I don't know if Erlang's processes are even more lightweight than green threads. According to Wikipedia [1], Linux actually performs really well in terms of context switching OS threads compared to green threads. But I can't find the paper Wikipedia cites, and it seems not to be downloadable from ACM [2]. The problem seems to be two part: 1. Starting new native thread takes a relatively long time 2. Native threads means a fairly significant memory and scheduler consumption. In Java-based threads, that means for example that a 20,000 connection server is pretty much the maximum that's possible in that model. In Erland, 20,000 concurrent connections is what you should expect from an unoptimized system. This is at least partly due to kernel design, but also due to language design. A purely side- effect free language is much more capable of sharing data. In *that* regard, it's probably easier to optimize Erlang than Clojure. But in any case, massive networking is the key - In clojure, there is plenty of room to improve in that regard. In Erlang probably too, but erlang has no really interesing memory sharing mechanisms. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
Put *Comparative performance evaluation of Java threads for embedded applications**: Linux thread vs. Green thread Google, and click the second link that says [PDF]. It is for free. * On Fri, Dec 18, 2009 at 7:20 PM, Patrick Kristiansen patrick.kristian...@gmail.com wrote: On Dec 18, 11:06 pm, Joost jo...@zeekat.nl wrote: Erm, what's your master? I'll assume CS. Well it's software engineering, but close enough :). I should have mentioned that. Personally, I'm interested in whether complete thread abstraction that makes threads as light-weight as possible, but also the only way to do concurrency (like Erlang provides with its processes) is really the best way to model concurrent programs. I'm over 95% sure that native threads really are not the best way to model in-process concurrency for most programs, simply because of all the overhead that you incur especially when dealing with massively concurrent long- running thead-like-things - since you both know erlang, you will know what I'm talking about - that sort of approach really doesn't work in clojure, that's why clojure uses thread pools for agents etc. But maybe a new and lightweight in-kernel thread model is an interesing subject? Just asking :) Thanks for your suggestion. It's an interesting idea, and we've been considering something along those lines. I don't know if Erlang's processes are even more lightweight than green threads. According to Wikipedia [1], Linux actually performs really well in terms of context switching OS threads compared to green threads. But I can't find the paper Wikipedia cites, and it seems not to be downloadable from ACM [2]. In any case, we will take this suggestion into consideration. [1] http://en.wikipedia.org/wiki/Green_threads#Performance [2] http://portal.acm.org/citation.cfm?id=603551 -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.comclojure%2bunsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Call for masters thesis ideas (possibly related to Clojure)
*Re: Update: One of the developers of PNUTS commentedhttp://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html?showComment=123388434#c1254841206330803677 on this post, pointing out that PNUTS performance is much better in practice (1-10ms/request) when caching layers are in place and making a few comparisons to Bigtable.* Note this very interesting conversation (below) which IMHO has potential for STM implementation in clojure http://clojure.org/refs (excerpt: PNUTs supports row-level transactions too-- they are done through a get followed by test-and-set. This is nothing new: it is called optimistic concurrency control and has been around in database literature for ages, and is also used by BigTable.). ... Can and would clojure help and simplify the construction of a framework for high-volume, high-availability, distributed web apps? Has something like PNUTs been implemented for Clojure and Java Clouds? http://www.blogger.com/profile/07702915403582319413 Brian Cooper http://www.blogger.com/profile/07702915403582319413 said... Hi folks, I'm another member of the PNUTS group at Yahoo! This has been a very interesting discussion; I'm glad you folks are as interested in this stuff as we are. Just to reiterate a few points that Utkarsh brought up: - There's no free lunch for performance, and if you want consistency some writes will have to go cross-datacenter, increasing the average latency. Cross-datacenter communication is required because of our mastership protocol. Consider a user who's record is mastered in California. If she flies to Europe, or a network problem causes her request to be redirected to a datacenter on the East coast, then her write will originate in a non-master datacenter and be forwarded back to the master for that record. In practice this happens 10-20% of the time, just because of the way web requests happen. Even if you had Paxos, and managed to put the local participants in the same datacenter, occasionally a write would originate in the non-local datacenter and pay the cross-datacenter latency to find enough members of the quorum. So this cost is really unavoidable. - You could weaken the consistency, to something like eventual consistency (write anywhere and resolve conflicts later) or even best effort (write anywhere and don't worry about conflicts) and avoid ever paying the cross-datacenter cost. And in fact it is possible to turn off mastership in PNUTS. But then you need a resolution protocol, and until conflicts are resolved inconsistent copies of the data are visible to readers. So again there is no free lunch. - Anonymous is write that this system is not as optimized for scans a la MapReduce. In fact, you make different architectural decisions if you are optimizing for low-latency updates to a geographic database than if you are optimizing for scanning for MapReduce within a single datacenter. We have been able to run Hadoop jobs over PNUTS and get performance that is pretty good, just not as good as a native store (HDFS) optimized for MapReduce. So if you want to transactionally update data with very low latency and occasionally run MapReduce, you can use PNUTS; if you want to always run MapReduce but don't need a lot of high performance updates, use HDFS. - As Utkarsh has said, the hardware used for the paper is not as good as production hardware (e.g. no battery-backed write cache, and other limitations). We hope to publish some numbers from live workloads on the production system soon. .. -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en