[ https://issues.apache.org/jira/browse/CASSANDRA-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869835#action_12869835 ]

Karthick Sankarachary commented on CASSANDRA-1101:
--------------------------------------------------

{quote}After looking through the code, I'm not sure I'm convinced that the 
OutputCommitter is necessary. I understand that the intention is that if a Map 
task fails, you'd want to avoid writing its output to Cassandra, but I think a 
committer is only really useful for OutputFormats where you get atomic 
behaviour (via RDBMS transactions, or via an atomic HDFS rename). It is just as 
likely (probably more likely) that the OutputCommitter fails halfway through 
its mutations as it is that the RecordWriter fails halfway through writing, and 
then your column family is left in an in-between state anyway. 
{quote}

Agreed. Considering that Cassandra doesn't qualify as a truly transactional 
system, at least not yet, bringing the OutputCommitter into the mix might be a 
tad misleading. So, as you suggested, I've moved the pseudo-commit logic that 
used to live in ColumnFamilyOutputCommitter#commitTask into 
ColumnFamilyRecordWriter#close.
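
To illustrate the new shape of the writer (everything here other than the two 
methods named above is a placeholder, not the actual patch):

{code:java}
// Minimal sketch, not the actual patch: the flush that used to live in
// ColumnFamilyOutputCommitter#commitTask now happens in the writer's close().
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class SketchRecordWriter extends RecordWriter<String, String> {
    private final List<String> batchedMutations = new ArrayList<String>();

    @Override
    public void write(String key, String value) {
        // batch in memory; no round trip to the server yet
        batchedMutations.add(key + ":" + value);
    }

    @Override
    public void close(TaskAttemptContext context) throws IOException {
        // formerly ColumnFamilyOutputCommitter#commitTask: push everything
        // batched so far to the server in one shot
        sendBatch(batchedMutations);
    }

    private void sendBatch(List<String> mutations) throws IOException {
        // placeholder for the Thrift batch_mutate call(s)
    }
}
{code}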

{quote}Also, could you make the SlicePredicate and get_range_slices call in 
checkOutputSpecs optional, and if the predicate isn't present in the 
Configuration, skip the check? I expect that a lot of people will be fine with 
overwriting whatever exists, and needing to subclass to do it would be an 
unnecessary roadblock.{quote}

Good point. Now, if no slice predicate is present in the Configuration, 
checkOutputSpecs skips the get_range_slices check altogether rather than 
bailing out.
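
Concretely, the check now short-circuits when the predicate is absent; a 
minimal sketch (the configuration key name here is hypothetical):

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.JobContext;

public class OutputSpecSketch {
    // hypothetical key; the patch may name it differently
    static final String PREDICATE_KEY = "cassandra.output.predicate";

    // Only enforce the "no pre-existing rows" check when a slice predicate
    // was actually supplied in the job configuration.
    public static void checkOutputSpecs(JobContext context) throws IOException {
        Configuration conf = context.getConfiguration();
        if (conf.get(PREDICATE_KEY) == null)
            return; // no predicate: overwriting is fine, skip the check
        // otherwise, deserialize the SlicePredicate and call get_range_slices,
        // failing fast if any rows already match the predicate
    }
}
{code}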

{quote}Third paragraph in ColumnFamilyOutputFormat javadoc is stale{quote}
Updated that paragraph to reflect the changes above.

{quote}EndpointCallable.call doesn't close the Thrift socket it opens{quote}
Good catch. That's fixed in the revised patch.

The revised patch has been attached at the same location as before.

Finally, going back to your first point about Cassandra not being 
transactional, is that something that's worth looking into? I've done some work 
in that area, such as the atomic feature described in ODE-396, among other 
things.

> A Hadoop Output Format That Targets Cassandra
> ---------------------------------------------
>
>                 Key: CASSANDRA-1101
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1101
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>    Affects Versions: 0.6.1
>            Reporter: Karthick Sankarachary
>            Assignee: Stu Hood
>         Attachments: CASSANDRA-1101-V1.patch, CASSANDRA-1101.patch
>
>
> Currently, there exists a Hadoop-specific input format (viz., 
> ColumnFamilyInputFormat) that allows one to iterate over the rows in a given 
> Cassandra column family and treat them as the input to a Hadoop map task. By 
> the same token, one may need to feed the output of a Hadoop reduce task into 
> a Cassandra column family, for which no mechanism exists today. This calls 
> for the definition of a Hadoop-specific output format that accepts a <key, 
> columns> pair and writes it out to a given column family.
>
> Here, we describe an output format known as ColumnFamilyOutputFormat, which 
> allows reduce tasks to persist keys and their associated columns as Cassandra 
> rows in a given column family.  By default, it prevents overwriting existing 
> rows in the column family, by ensuring at initialization time that it 
> contains no rows in the given slice predicate. For the sake of speed, it 
> employs a lazy write-back caching mechanism, where its record writer batches 
> mutations built from the reduce's inputs (in a task-specific map) but stops 
> short of actually mutating the rows. The latter responsibility falls on its 
> output committer, which makes the changes official by sending a batch mutate 
> request to Cassandra.
>
> The record writer, which is called ColumnFamilyRecordWriter, maps the input 
> <key, value> pairs to a Cassandra column family. In particular, it creates 
> mutations for each column in the value, which it then associates with the 
> key, and in turn the responsible endpoint.  Note that, given that round trips 
> to the server are fairly expensive, it merely batches the mutations 
> in-memory, and leaves it to the output committer to send the batched 
> mutations to the server. Furthermore, the writer groups the mutations by the 
> endpoint responsible for the rows being affected. This allows the output 
> committer to execute the mutations in parallel, on an endpoint-by-endpoint 
> basis.
>
> The output committer, which is called ColumnFamilyOutputCommitter, traverses 
> the mutations collected by the record writer, and sends them to the endpoints 
> responsible for them. Since the total set of mutations is partitioned by 
> endpoint, and each partition can be applied independently, the mutations can 
> be committed using multiple threads, one per endpoint. This reduces the time 
> it takes to propagate the mutations to the server, since (a) the client 
> eliminates one network hop that the server would otherwise have had to make, 
> and (b) each endpoint node has to deal with only a subset of the total set of 
> mutations.
>
> For convenience, we also define a default reduce task, called 
> ColumnFamilyOutputReducer, which collects the columns in the input value and 
> maps them to a data structure expected by Cassandra. By default, it assumes 
> the input value to be in the form of a ColumnWritable, which denotes a 
> name/value pair corresponding to a certain column. This reduce task is in turn 
> used by the attached test case, which maps every <key, value> pair in a 
> sample input sequence file to a <key, column> pair, and then reduces them by 
> aggregating columns corresponding to the same key. Eventually, the batched 
> <key, columns> pairs are written to the column family associated with the 
> output format.
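
For what it's worth, the endpoint-grouped, parallel flush the description 
outlines boils down to something like the sketch below (placeholder types, not 
the patch itself). Note that the description predates this patch: after the 
change discussed above, this flush happens in ColumnFamilyRecordWriter#close 
rather than in a separate committer.

{code:java}
// Sketch of the endpoint-grouped, parallel flush described above
// (placeholder types; not the actual patch).
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EndpointBatchSketch {
    // mutations grouped by the endpoint responsible for their keys
    private final Map<String, List<String>> mutationsByEndpoint =
            new HashMap<String, List<String>>();

    void batch(String endpoint, String mutation) {
        List<String> batch = mutationsByEndpoint.get(endpoint);
        if (batch == null) {
            batch = new ArrayList<String>();
            mutationsByEndpoint.put(endpoint, batch);
        }
        batch.add(mutation);
    }

    // flush each endpoint's batch on its own thread, so the write to each
    // endpoint proceeds in parallel with the others
    void flush() throws InterruptedException {
        if (mutationsByEndpoint.isEmpty())
            return;
        ExecutorService pool =
                Executors.newFixedThreadPool(mutationsByEndpoint.size());
        for (final Map.Entry<String, List<String>> entry
                : mutationsByEndpoint.entrySet()) {
            pool.submit(new Runnable() {
                public void run() {
                    // one batch_mutate per endpoint (placeholder)
                    sendBatchMutate(entry.getKey(), entry.getValue());
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }

    void sendBatchMutate(String endpoint, List<String> mutations) {
        // placeholder for the Thrift batch_mutate call to that endpoint
    }
}
{code}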

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
