On 11/23/2013 03:01 PM, Jonathan Dursi wrote:
> On Nov 23, 2013, at 1:40PM, Joe Landman
> <[email protected]> wrote:
>
>> That is, we as a community have much to offer the growing big data
>> community.
>
> I think this is completely true, and somewhat urgent. The two
> communities have a lot to teach each other.
>
> The big data community remains incredibly naive about a lot of
> performance/scalability issues - and of course they are, they’ve only
> been at this a few years. Traditional HPC has a *lot* of hard-won
> knowledge and experience to offer.
>
> But conversely, where we’ve been naive is the importance of easily
> deployable, scalable, easy-to-develop-for software frameworks, even
> if it initially comes at substantial cost in terms of
> single-processor performance. If we choose not to learn the lessons
> of rapid growth of tools like Hadoop, we are in trouble as a
> community.

Absolutely.

> We’ve talked for years about how hardware is advancing more rapidly
> than software, but not done much about it; now someone has, and it’s
> not us. As a result, people are already trying to fit very HPCy
> sorts of problems into Hadoopy sorts of frameworks (cf, all the BSP
> stuff in Pregel or Hama) because it’s so much easier to get things
> working, and so much easier to find developers to maintain. When it
> comes to choosing a direction for a new project, 100x the number of
> developers will always win over single-processor performance, or even
> scaling, because you can then direct enormous amounts of resources to
> fixing performance issues in the underlying frameworks.

I am a huge believer in plug-in-turn-on-walk-away. That should be all
there is to configuration. Cluster distros should be gone. Not that
Chef/Puppet are the right way to go (there are many reasons why they
aren't, IMO), but there are some fantastic concepts coming in from the
cloud side (Docker.io, smartos, ...) that we need to collectively
leverage.

But likewise, we still don't quite have resilient computation down,
among other things. Checkpointing a job is, in many cases, simply not a
viable option. Our job schedulers are cool, but they were designed for a
different era.

>
> Jonathan
>

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: [email protected]
web  : http://scalableinformatics.com
twtr : @scalableinfo
phone: +1 734 786 8423 x121
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
