Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-21 Thread kyle smith
I also vote for static analysis.  People have combined static analysis
with data mining to predict software defects.
http://scholar.google.com/scholar?hl=enq=data+mining+software+defectsbtnG=Searchas_sdt=2000as_ylo=as_vis=0
However, it doesn't look like anyone's tried it with any functional
languages.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-19 Thread jng27
What about adding circular dependency resolution to the compiler
between Clojure - Java code ?

I know, I know - it's not very sexy but it would go a long way for
companies with larger legacy Java code bases that may be considering
moving to Clojure. Being able to replace Java code 'one step at a
time' would be invaluable in that area and it also makes Clojure
easier to 'sell'.

On Dec 18, 4:35 am, Patrick Kristiansen
patrick.kristian...@gmail.com wrote:
 Hi

 We're two students that have been working with concurrent programming
 languages (Erlang and Clojure), and while both languages are very
 interesting, we would like to work on something related to Clojure in
 our masters thesis.

 I once asked on #clojure for ideas, and Rich Hickey suggested looking
 into predicate dispatch and it is one idea that we are considering. We
 have also considered working on distributed Clojure, but I don't know
 if there is already an effort going on in that regard?

 Do you have any other suggestions? We'd be really thankful for any
 suggestions.

 Thanks in advance.

 -Patrick

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-19 Thread Micklat
How about static analysis? Plenty of interesting problems there. For
example:

1. detect incorrect uses of transients (those that would certainly
lead to exceptions) and report them during compile-time.

2. related to (1), but different: detect places where the compiler can
implicitly convert collections to transients and back, without losing
correctness, and use that as a speed-up.

3. soft typing, either to speed up code or to detect incorrect code
early. Such things have been done with Scheme.

I believe Clojure has certain advantages when working on static
analysis:
* you can call the reader and macro expander during analysis, which
greatly simplifies the language to be analyzed.
* the core language is relatively simple
* the compiler source is freely available and is written in a
relatively high-level language (java and not c), so it (should) be
relatively easy to get into.

Relatedly, it might be interesting to create formal semantics for
Clojure, and interface it with a proof assistant, enabling proofs to
be written and checked about the behavior of clojure programs.

In that context, one of the opportunities provided by lispiness is
that macros can help write the proof, by automatically creating
propositions about the code that they return. For example, (dotimes [i
n] body) can be modified to attach these propositions to its output,
using an appropriate proof language:
- i=0
- i(the result of) n
- the loop terminates if its body terminates

That's just scratching the surface. That field is deep and wide, and a
complete implementation is probably beyond the scope of a master's
theses. Still, parts of it can be done.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-19 Thread Patrick Kristiansen
On Dec 19, 1:52 am, ajay gopalakrishnan ajgop...@gmail.com wrote:
 Put

 *Comparative performance evaluation of Java threads for embedded
 applications**: Linux thread vs. Green thread

Your Google search skills are obviously beyond ours. :) I've found it
now.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-19 Thread Jules
Predicate dispatch would be an interesting topic. The #1 problem with
predicate dispatch is method ordering. Predicate dispatch as described
in the orignal paper [http://www.cs.washington.edu/homes/mernst/pubs/
dispatching-ecoop98-abstract.html] decides what order to used based on
implication between predicates. For example:

def foo(x) where x is Animal = A
def foo(x) where x is Chicken = B
def foo(x) where x is Bird = C

If you call foo(new Crow) you want the method with where x is Bird
to be called. A Crow is not a Chicken, so we can rule out the Chicken
method. But a Crow is also an Animal, so how does the dispatch decide
that it calls the Bird method and not the Animal method? The answer of
the authors of the predicate dispatch paper is that you determine
implication between predicates. If x is a Bird, that implies that x is
an Animal. So we test if x is a Bird and only after that we consider
the Animal method.

This is good but unfortunately checking implication between predicates
is undecidable. For example if we define this:

def bar(n) where 5^n == 5 mod n = A
def bar(n) where prime?(n) = B

How is the static analysis going to determine that the second
predicate implies the first? You can solve this case by requiring the
programmer to provide implication information to the compiler, like
this:

implication prime?(n) = 5^n == 5 mod n

As you can see this isn't nice. In my opinion we need a better way to
do method ordering. I've been thinking about this and my proposal is
to use the definition order in the source code to decide which methods
to check first: try methods that are defined last first.

def baz(x) where p(x) = A
def baz(x) where q(x) = B
def baz(x) where r(x) = C

We first check for r(x) and if it's true we execute C. If it's false
then we check if q(x) is true and execute B. If q(x) is false we check
p(x) and execute A.

The implementation of this kind of predicate dispatch is trivial. You
don't need a complicated implication checker. It's just a simple
macro. But the question is whether this simple method ordering rule
works in practice. With this order you can do everything that you can
do with a class hierarchy (in Java for example). It also subsumes
Clojure's hierarchies by using a predicate this is itself a method. I
think it's interesting to explore this further to see if this order
still works well for more advanced uses of predicate dispatch, and to
explore alternative orderings.

Some other research ideas:

- Functional reactive programming in Clojure. A problem with FRP is
that you have to invert dependencies. Instead of specifying for each
cause what its effects are, you have to specify for each effects what
its possible causes are. Can you solve this? The problem is much
easier if you don't try to prevent glitches as other FRP systems do.
Is a system without glitch prevention useful, is introducing glitches
for removing cause and effect inversion a good trade off?
- Type/value inference with abstract interpretation. Can you write an
abstact interpretation to infer possible values of a variable? Is this
useful in creating a better compiler. Can the information be used to
create better editors (intellisense?).
- In a statically typed language like Haskell  Scala you can write a
generic sum function that sums a list of things. You cannot do this in
any dynamic language that I know of. You want sum to work on any list
of things that support +. Suppose that numbers support + and strings
support + (maybe they shouldn't but bear with me for this example). sum
([1,2]) = 3. sum([ab,cd]) = abcd. Now comes the problem. What is
sum([])? For numbers it should be 0, for strings it should be . In a
statically typed language you can use the type of the list to
determine which value to return. How can we write a generic sum
function in a dynamic language?

Jules

On Dec 18, 1:35 pm, Patrick Kristiansen
patrick.kristian...@gmail.com wrote:
 Hi

 We're two students that have been working with concurrent programming
 languages (Erlang and Clojure), and while both languages are very
 interesting, we would like to work on something related to Clojure in
 our masters thesis.

 I once asked on #clojure for ideas, and Rich Hickey suggested looking
 into predicate dispatch and it is one idea that we are considering. We
 have also considered working on distributed Clojure, but I don't know
 if there is already an effort going on in that regard?

 Do you have any other suggestions? We'd be really thankful for any
 suggestions.

 Thanks in advance.

 -Patrick

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-19 Thread Patrick Kristiansen
I want to thank you all for your suggestions, the clojure community is
really great!

On Dec 18, 1:35 pm, Patrick Kristiansen
patrick.kristian...@gmail.com wrote:
 Hi

 We're two students that have been working with concurrent programming
 languages (Erlang and Clojure), and while both languages are very
 interesting, we would like to work on something related to Clojure in
 our masters thesis.

 I once asked on #clojure for ideas, and Rich Hickey suggested looking
 into predicate dispatch and it is one idea that we are considering. We
 have also considered working on distributed Clojure, but I don't know
 if there is already an effort going on in that regard?

 Do you have any other suggestions? We'd be really thankful for any
 suggestions.

 Thanks in advance.

 -Patrick

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Patrick Kristiansen
Hi

We're two students that have been working with concurrent programming
languages (Erlang and Clojure), and while both languages are very
interesting, we would like to work on something related to Clojure in
our masters thesis.

I once asked on #clojure for ideas, and Rich Hickey suggested looking
into predicate dispatch and it is one idea that we are considering. We
have also considered working on distributed Clojure, but I don't know
if there is already an effort going on in that regard?

Do you have any other suggestions? We'd be really thankful for any
suggestions.

Thanks in advance.

-Patrick

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread ajay gopalakrishnan
Hi,

An idea I was interested in (more from an learning opportunity perspective
than thesis) is provided more Persistent data structures. As of now, we have
the the Array mapped Hash tree based structure that works well for Vectors,
Maps and Graphs too. What I feel is missing is Tree based structures.
Although, you can simulate them using Vectors as children indices, I think
if we directly implement the Persistent Data structure of Binomial/Fibonacci
Heaps and BSTs with all their operations (insert,delete,search,meld), it
would be much faster. I could be wrong though!

Anyways ... my 2 cents.

Thanks,
Ajay G.

-- Forwarded message --
From: Patrick Kristiansen patrick.kristian...@gmail.com
Date: Fri, Dec 18, 2009 at 7:35 AM
Subject: Call for masters thesis ideas (possibly related to Clojure)
To: Clojure clojure@googlegroups.com


Hi

We're two students that have been working with concurrent programming
languages (Erlang and Clojure), and while both languages are very
interesting, we would like to work on something related to Clojure in
our masters thesis.

I once asked on #clojure for ideas, and Rich Hickey suggested looking
into predicate dispatch and it is one idea that we are considering. We
have also considered working on distributed Clojure, but I don't know
if there is already an effort going on in that regard?

Do you have any other suggestions? We'd be really thankful for any
suggestions.

Thanks in advance.

-Patrick

--
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.comclojure%2bunsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Sean Devlin
Most of the distributed Clojure work is based on running Clojure on a
distributed JVM package.

Here's a link on Clojure + Terracotta.  I have heard mention of other
solutions as well, I just don't recall them.

http://groups.google.com/group/clojure/browse_thread/thread/9a9690947617906/6bc50fb45ca56e7d

Sean

On Dec 18, 7:35 am, Patrick Kristiansen
patrick.kristian...@gmail.com wrote:
 Hi

 We're two students that have been working with concurrent programming
 languages (Erlang and Clojure), and while both languages are very
 interesting, we would like to work on something related to Clojure in
 our masters thesis.

 I once asked on #clojure for ideas, and Rich Hickey suggested looking
 into predicate dispatch and it is one idea that we are considering. We
 have also considered working on distributed Clojure, but I don't know
 if there is already an effort going on in that regard?

 Do you have any other suggestions? We'd be really thankful for any
 suggestions.

 Thanks in advance.

 -Patrick

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Tassilo Horn
ajay gopalakrishnan ajgop...@gmail.com writes:

Hi Ajay,

 An idea I was interested in (more from an learning opportunity
 perspective than thesis) is provided more Persistent data structures.
 As of now, we have the the Array mapped Hash tree based structure that
 works well for Vectors, Maps and Graphs too.

In the last few month I didn't find any time to play around with clojure
or follow discussions on this list, so could you give me a pointer to
persistent graph structures?  That would be an area where I would be
utmost interested.

Bye,
Tassilo

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Niels Mayer
(0)  Create a toolkit to run multiple parallel, tightly communicating
clojure apps on google-app engine, simulating a single, long-running,
multithreaded JVM instance that does not appear, to the user, to be limited
by the constraints of GAE's java implementation (e.g. single threading,
shared refs work despite being distributed); at the same time, the toolkit
would minimize resources consumed in GAE, persisting threads/continuations
waiting for data, and heuristically determining lowest cost for long-term
storage versus memory and runtime consumption.

Note: http://elhumidor.blogspot.com/2009/04/clojure-on-google-appengine.html

THE BIG CAVEAT


  Two unusual aspects of the Google AppEngine environment create pretty major

 constraints on your ability to write idiomatic Clojure.


  First, an AppEngine application runs in a security context that doesn't

 permit spawning threads, so you won't be able to use Agents, the

 clojure.parallel library, or Futures.


  Second, one of the most exciting features of AppEngine is that your

 application will be deployed on Google's huge infrastructure, dynamically

 changing its footprint depending on demand. That means you'll potentially be

 running on many JVMs at once. Unfortunately this is a strange fit for

 Clojure's concurrency features, which are most useful when you have precise

 control over what lives on what JVM (and simplest when everything runs on

 one JVM). Since shared references (Vars, Refs, and Atoms) are shared only

 within a single JVM, they are not suitable for many of their typical uses

 when running on AppEngine. You should still use Clojure's atomic references

 (and their associated means of modification) for any state that it makes

 sense to keep global per-JVM, since there may be multiple threads serving

 requests in one JVM. But remember JVMs will come and go during the lifetime

 of your application, so anything truly global should go in the Datastore or

 Memcache.


(1)  a clojure implementation of Yahoo's PNUTs, using STM's and all the cool
facilities clojure provides: http://research.yahoo.com/files/pnuts.pdf
(interesting
to have a writeup of a real-world impl alongside comparisons to Google
Bigtable and Amazon Dynamo)

We describe PNUTS, a massively parallel and geographically distributed
 database system for Yahoo!'s web applications.


 The foremost requirements of web applications are scalability, consistently
 good response time for geographically dispersed users, and high
 availability. At the same time, web applications can frequently tolerate
 relaxed consistency guarantees.


 For example, if a user changes an avatar ... little harm is done if the new
 avatar is not initially visible to one friend  It is often acceptable to
 read (slightly) stale data, but occasionally stronger guarantees are
 required by applications.


 PNUTS provides a consistency model that is between the two extremes of
 general serializability and eventual consistency ... We provide per-record
 timeline consistency: all replicas of a given record apply all updates to
 the record in the same order  The application [can] indicate cases where
 it can do with some relaxed consistency for higher performance  [such as
 reading] a possibly stale version of the record.


Some interesting commentary from
http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html

http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html
..
*
*
*When reading the paper, a couple things about PNUTS struck me as
surprising:

First, the system is layered on top of the guarantees of a reliable pub-sub
message broker which acts both as our replacement for a redo log and our
replication mechanism. I have to wonder if the choice to not build these
pieces of the database themselves could lead to missed opportunities for
improving performance and efficiency.

Second, as figures 3 and 4 show, the average latency of requests to their
database seems quite high, roughly 100 ms. This is high enough that web
applications probably would incur too much total latency if they made a few
requests serially (e.g. ask for some data, then, depending on what the data
looks like, ask for some other data). That seems like a problem.

Please see also my August 2006 post, Google Bigtable
paperhttp://glinden.blogspot.com/2006/08/google-bigtable-paper.html,
which discusses the distributed database behind many products at Google.

Please see also my earlier post, Highly available distributed hash store at
Amazonhttp://glinden.blogspot.com/2007/10/highly-available-distributed-hash.html,
on the distributed database behind some features at Amazon.com.

Please see also my earlier posts, Cassandra data store at
Facebookhttp://glinden.blogspot.com/2008/08/cassandra-data-store-at-facebook.html
and HBase: A Google Bigtable
clonehttp://glinden.blogspot.com/2007/07/hbase-google-bigtable-clone.html
.

Update: One of the developers of PNUTS

Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Joost
On 18 dec, 13:35, Patrick Kristiansen patrick.kristian...@gmail.com
wrote:
 Hi

 We're two students that have been working with concurrent programming
 languages (Erlang and Clojure), and while both languages are very
 interesting, we would like to work on something related to Clojure in
 our masters thesis.

 I once asked on #clojure for ideas, and Rich Hickey suggested looking
 into predicate dispatch and it is one idea that we are considering. We
 have also considered working on distributed Clojure, but I don't know
 if there is already an effort going on in that regard?

 Do you have any other suggestions? We'd be really thankful for any
 suggestions.

Erm, what's your master? I'll assume CS.

Personally, I'm interested in whether complete thread abstraction that
makes threads as light-weight as possible, but also the only way to
do concurrency (like Erlang provides with its processes) is really
the best way to model concurrent programs. I'm over 95% sure that
native threads really are not the best way to model in-process
concurrency for most programs, simply because of all the overhead that
you incur especially when dealing with massively concurrent long-
running thead-like-things - since you both know erlang, you will know
what I'm talking about - that sort of approach really doesn't work in
clojure, that's why clojure uses thread pools for agents etc.

But maybe a new and lightweight in-kernel thread model is an
interesing subject? Just asking :)

Good luck with your thesis,
Joost Diepenmaat.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Patrick Kristiansen
On Dec 18, 11:06 pm, Joost jo...@zeekat.nl wrote:
 Erm, what's your master? I'll assume CS.

Well it's software engineering, but close enough :). I should have
mentioned that.

 Personally, I'm interested in whether complete thread abstraction that
 makes threads as light-weight as possible, but also the only way to
 do concurrency (like Erlang provides with its processes) is really
 the best way to model concurrent programs. I'm over 95% sure that
 native threads really are not the best way to model in-process
 concurrency for most programs, simply because of all the overhead that
 you incur especially when dealing with massively concurrent long-
 running thead-like-things - since you both know erlang, you will know
 what I'm talking about - that sort of approach really doesn't work in
 clojure, that's why clojure uses thread pools for agents etc.

 But maybe a new and lightweight in-kernel thread model is an
 interesing subject? Just asking :)

Thanks for your suggestion. It's an interesting idea, and we've been
considering something along those lines. I don't know if Erlang's
processes are even more lightweight than green threads. According to
Wikipedia [1], Linux actually performs really well in terms of context
switching OS threads compared to green threads. But I can't find the
paper Wikipedia cites, and it seems not to be downloadable from ACM
[2].

In any case, we will take this suggestion into consideration.

[1] http://en.wikipedia.org/wiki/Green_threads#Performance
[2] http://portal.acm.org/citation.cfm?id=603551

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Joost
On 19 dec, 01:20, Patrick Kristiansen patrick.kristian...@gmail.com
wrote:
 Thanks for your suggestion. It's an interesting idea, and we've been
 considering something along those lines. I don't know if Erlang's
 processes are even more lightweight than green threads. According to
 Wikipedia [1], Linux actually performs really well in terms of context
 switching OS threads compared to green threads. But I can't find the
 paper Wikipedia cites, and it seems not to be downloadable from ACM
 [2].

The problem seems to be two part:

1. Starting new native thread takes a relatively long time

2. Native threads means a fairly significant memory and scheduler
consumption. In Java-based threads, that means for example that a
20,000 connection server is pretty much the maximum that's possible in
that model. In Erland, 20,000 concurrent connections is what you
should expect from an unoptimized system. This is at least partly due
to kernel design, but also due to language design. A purely side-
effect free language is much more capable of sharing data. In *that*
regard, it's probably easier to optimize Erlang than Clojure.

But in any case, massive networking is the key  - In clojure, there is
plenty of room to improve in that regard. In Erlang probably too, but
erlang has no really interesing memory sharing mechanisms.

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread ajay gopalakrishnan
Put

*Comparative performance evaluation of Java threads for embedded
applications**: Linux thread vs. Green thread

Google, and click the second link that says [PDF]. It is for free.
*
On Fri, Dec 18, 2009 at 7:20 PM, Patrick Kristiansen 
patrick.kristian...@gmail.com wrote:

 On Dec 18, 11:06 pm, Joost jo...@zeekat.nl wrote:
  Erm, what's your master? I'll assume CS.

 Well it's software engineering, but close enough :). I should have
 mentioned that.

  Personally, I'm interested in whether complete thread abstraction that
  makes threads as light-weight as possible, but also the only way to
  do concurrency (like Erlang provides with its processes) is really
  the best way to model concurrent programs. I'm over 95% sure that
  native threads really are not the best way to model in-process
  concurrency for most programs, simply because of all the overhead that
  you incur especially when dealing with massively concurrent long-
  running thead-like-things - since you both know erlang, you will know
  what I'm talking about - that sort of approach really doesn't work in
  clojure, that's why clojure uses thread pools for agents etc.
 
  But maybe a new and lightweight in-kernel thread model is an
  interesing subject? Just asking :)

 Thanks for your suggestion. It's an interesting idea, and we've been
 considering something along those lines. I don't know if Erlang's
 processes are even more lightweight than green threads. According to
 Wikipedia [1], Linux actually performs really well in terms of context
 switching OS threads compared to green threads. But I can't find the
 paper Wikipedia cites, and it seems not to be downloadable from ACM
 [2].

 In any case, we will take this suggestion into consideration.

 [1] http://en.wikipedia.org/wiki/Green_threads#Performance
 [2] http://portal.acm.org/citation.cfm?id=603551

 --
 You received this message because you are subscribed to the Google
 Groups Clojure group.
 To post to this group, send email to clojure@googlegroups.com
 Note that posts from new members are moderated - please be patient with
 your first post.
 To unsubscribe from this group, send email to
 clojure+unsubscr...@googlegroups.comclojure%2bunsubscr...@googlegroups.com
 For more options, visit this group at
 http://groups.google.com/group/clojure?hl=en


-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Call for masters thesis ideas (possibly related to Clojure)

2009-12-18 Thread Niels Mayer

 *Re:
 Update: One of the developers of PNUTS 
 commentedhttp://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html?showComment=123388434#c1254841206330803677
  on
 this post, pointing out that PNUTS performance is much better in practice
 (1-10ms/request) when caching layers are in place and making a few
 comparisons to Bigtable.*


Note this very interesting conversation  (below) which IMHO has potential
for STM implementation in clojure http://clojure.org/refs (excerpt: PNUTs
supports row-level transactions too-- they are done through a get followed
by test-and-set. This is nothing new: it is called optimistic concurrency
control and has been around in database literature for ages, and is also
used by BigTable.).

...

Can and would clojure help and simplify the construction of a framework for
high-volume, high-availability, distributed web apps? Has something like
PNUTs been implemented for Clojure and Java Clouds?


http://www.blogger.com/profile/07702915403582319413
Brian Cooper http://www.blogger.com/profile/07702915403582319413 said...

Hi folks,

I'm another member of the PNUTS group at Yahoo! This has been a very
interesting discussion; I'm glad you folks are as interested in this stuff
as we are.

Just to reiterate a few points that Utkarsh brought up:

- There's no free lunch for performance, and if you want consistency some
writes will have to go cross-datacenter, increasing the average latency.
Cross-datacenter communication is required because of our mastership
protocol. Consider a user who's record is mastered in California. If she
flies to Europe, or a network problem causes her request to be redirected to
a datacenter on the East coast, then her write will originate in a
non-master datacenter and be forwarded back to the master for that record.
In practice this happens 10-20% of the time, just because of the way web
requests happen.

Even if you had Paxos, and managed to put the local participants in the
same datacenter, occasionally a write would originate in the non-local
datacenter and pay the cross-datacenter latency to find enough members of
the quorum. So this cost is really unavoidable.

- You could weaken the consistency, to something like eventual consistency
(write anywhere and resolve conflicts later) or even best effort (write
anywhere and don't worry about conflicts) and avoid ever paying the
cross-datacenter cost. And in fact it is possible to turn off mastership in
PNUTS. But then you need a resolution protocol, and until conflicts are
resolved inconsistent copies of the data are visible to readers. So again
there is no free lunch.

- Anonymous is write that this system is not as optimized for scans a la
MapReduce. In fact, you make different architectural decisions if you are
optimizing for low-latency updates to a geographic database than if you are
optimizing for scanning for MapReduce within a single datacenter. We have
been able to run Hadoop jobs over PNUTS and get performance that is pretty
good, just not as good as a native store (HDFS) optimized for MapReduce. So
if you want to transactionally update data with very low latency and
occasionally run MapReduce, you can use PNUTS; if you want to always run
MapReduce but don't need a lot of high performance updates, use HDFS.

- As Utkarsh has said, the hardware used for the paper is not as good as
production hardware (e.g. no battery-backed write cache, and other
limitations). We hope to publish some numbers from live workloads on the
production system soon.
..

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en