Sorry guys, I sent the notes out to the wrong list.

Sent from my iPhone

On Mar 27, 2012, at 11:54, Ted Yu <[email protected]> wrote:

> Can someone explain the notes below?
>
> bq. - hbase deman was with hadoop
>
> bq. - scan kind of halfass, for hive.
>
> bq. - finace sector
>
> Thanks
>
> On Tue, Mar 27, 2012 at 11:49 AM, Jonathan Hsieh <[email protected]> wrote:
>
>> People:
>> Cloudera: Todd, Dave W, Shaneel M, Jonathan H, Himanshu, Greg C, Matteo B (remote)
>> FB: Nicolas, Amir
>>
>> druba - ubase/hstore - transaction processing, through hive-hbase integration.
>>
>> hbase team with hdfs team.
>> - hbase deman was with hadoop
>>
>> NY - carve out hunk of HBase to work on.
>>
>> Long term:
>> real time hive, deep integration.
>> - beyond just translate to MR job.
>> - Use in megastore.
>> - scan kind of halfass, for hive.
>> - previously point query optimization.
>> - analytics too long to scan table.
>> - doing on demand compression.
>>
>> Edgecases
>> - finace sector
>> - gpu cases.
>>
>> Uptime and availability.
>> - chaos monkey
>> - poll all regions
>>
>> Hbase 0.89 - fast region failover.
>> - down time down to..
>>
>> Take down rack - test cases
>>
>> putting data node selection in master.
>> - on per region basis, hash chain - so assigned secondary and tertiary.
>>
>> What is Cloudera focus?
>>
>> HDFS HA story
>> - Talking to HW -- bookies in HDFS ("public story, but ...")
>> - logs in hdfs.
>> - Standby node.
>> - zk flag - halfass solution. "double fails" not in scope.
>> - todd: 3 journal daemons, quorum for edits, pluggable journal manager interface.
>>
>> Facebook - new data infrastructure
>> - focus on quality, reliability, visibility.
>> - upping rolling restart to improve monitoring
>>
>> HBase - stable depends on use case
>> - pushing out use cases
>> - ODS, (soon)
>> - Puma analytics
>> - ubase - researchy
>> - site integrity
>> - hash out cluster (generic kv store, persistent memcache ), multi-tenant cluster, "photo stuff" (haystack)
>> - wormhole - backup replication - on hashout cluster, master slave, cross DC replication.
>>
>> Replication - talk to Madu
>>
>> HDFS hard links - on github.
>> - at data node layer.
>> - hari m - HW - hard links also. (claims working prototype)
>>
>> Kannan -
>>
>> pubsub,
>> 2ndary index.
>> native c++ thrift client.
>> open sourcing folly (c++ stl)
>>
>> - distribute log splitting task manager
>> - ordering for bulk master operations, eliminate class of problems.
>>
>> Online schema changes
>> - high friction to change
>> - check column descriptor, then table, then configuration.
>> - tune new features for column family.
>>
>> FB doesn't care about access control.
>> - auditing - multi tenancy case.
>> - specific app servers that will access - perms
>> - FB will do security at a higher level
>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // [email protected]
>>
