Dear Wiki user, You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "HadoopSupport" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=2&rev2=3

--------------------------------------------------

- Cassandra version 0.6 and later support running Hadoop jobs against data in Cassandra, out of the box. See https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/ for an example. (Inserting the ''output'' of a Hadoop job into Cassandra has always been possible.)
+ Cassandra version 0.6 and later support running Hadoop jobs against data in Cassandra, out of the box. See https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/ for an example. (Inserting the ''output'' of a Hadoop job into Cassandra has always been possible.) Cassandra rows or row fragments (that is, pairs of (key, SortedMap of columns)) are input to Map tasks for processing by your job, as specified by a `SlicePredicate` that describes which columns to fetch from each row. Here's how this looks in the word_count example, which selects just one configurable columnName from each row:
+ {{{
+ ConfigHelper.setColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
+ SlicePredicate predicate = new SlicePredicate().setColumn_names(Arrays.asList(columnName.getBytes()));
+ ConfigHelper.setSlicePredicate(job.getConfiguration(), predicate);
+ }}}
- Cassandra also provides a [[http://hadoop.apache.org/pig/|Pig]] !LoadFunc for running jobs in the Pig DSL instead of writing Java code by hand. This is in https://svn.apache.org/repos/asf/cassandra/trunk/contrib/pig/.
+ Cassandra also provides a [[http://hadoop.apache.org/pig/|Pig]] `LoadFunc` for running jobs in the Pig DSL instead of writing Java code by hand. This is in https://svn.apache.org/repos/asf/cassandra/trunk/contrib/pig/.
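
For illustration, a Pig Latin session using that `LoadFunc` might look like the following sketch. The keyspace name `Keyspace1`, the column family `Standard1`, and the jar path are placeholders, not taken from the page; the exact class name and registration step depend on how the contrib/pig module is built.

{{{
-- Hypothetical sketch: load rows from a Cassandra column family
-- via the contrib/pig LoadFunc and count them.
-- (Keyspace/column-family names and the jar path are placeholders.)
register /path/to/cassandra-pig.jar;
rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();
grouped = GROUP rows ALL;
counted = FOREACH grouped GENERATE COUNT(rows);
DUMP counted;
}}}

Each tuple delivered to the script corresponds to one Cassandra row, mirroring the (key, SortedMap of columns) pairs that the Java `InputFormat` hands to Map tasks.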