[ https://issues.apache.org/jira/browse/CASSANDRA-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276205#comment-13276205 ]
Jonathan Ellis commented on CASSANDRA-2527:
-------------------------------------------

I guess the main drawback to directly reading the sstables is, you either have to set up some kind of external NFS sharing like Ilya did, or you have to be willing to live without non-local data access.

I'd be okay with saying "yes, you can run m/r against snapshots, but only with local map tasks." Better than not having it at all...

> Add ability to snapshot data as input to hadoop jobs
> ----------------------------------------------------
>
>         Key: CASSANDRA-2527
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-2527
>     Project: Cassandra
>  Issue Type: Improvement
>    Reporter: Jeremy Hanna
>      Labels: hadoop
>
> It is desirable to have immutable inputs to hadoop jobs for the duration of
> the job. That way re-execution of individual tasks does not alter the output.
> One way to accomplish this would be to snapshot the data that is used as
> input to a job.