On Jul 22, 5:27 am, Phillip B Oldham <phillip.old...@gmail.com> wrote:
> I understand that there are a number of MapReduce frameworks/tools
> that play nicely with Python (Disco, Dumbo/Hadoop), however these have
> "large" dependencies (Erlang/Java). Are there any MapReduce frameworks/
> tools which are either pure-Python, or play nicely with Python but
> don't require the Java/Erlang runtime environments?

I can't answer your question, but I would like to better understand the problem you are trying to solve. The Apache Hadoop/MapReduce Java application isn't really that "large" by modern standards, although it is generally run with large heap sizes for performance (-Xmx1024m or larger for the mapred.child.java.opts parameter).

MapReduce is designed for very fast parallel processing of terabyte-scale data sets across hundreds of physical nodes; what advantage would a pure Python approach have here?
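
For comparison, here is a minimal sketch of what a pure-Python map/reduce could look like on a single machine, using only the standard library. It is not how Hadoop or Disco work internally: nothing is distributed across nodes, and there is no fault tolerance or shuffle phase. The names map_words, merge_counts and word_count are made up for this illustration.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool


def map_words(line):
    """Map step: count the words in one line of text."""
    return Counter(line.lower().split())


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(lines, workers=None):
    """Run the map step (in a process pool if workers is set), then reduce."""
    if workers:
        # Parallel map across local processes only; no cluster involved.
        with Pool(workers) as pool:
            partials = pool.map(map_words, lines)
    else:
        # Serial fallback, handy for small inputs and debugging.
        partials = map(map_words, lines)
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(sample, workers=2).most_common(1))
```

Something like this may be enough when the data fits on one host. Past that point, the scheduling, data locality and failure handling that the Java/Erlang frameworks provide are the parts that would have to be rebuilt.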