Make Pig/CassandraStorage delete functionality disabled by default and 
configurable
-----------------------------------------------------------------------------------

                 Key: CASSANDRA-3628
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3628
             Project: Cassandra
          Issue Type: Task
            Reporter: Jeremy Hanna
            Assignee: Jeremy Hanna


Right now, the CassandraStorage loadstorefunc can delete a column.  In 
practice it is a bad idea to have that enabled by default.  A scenario: you do 
an outer join, some rows have no value for one of the attributes, and then you 
write all of the attributes of that relation out to Cassandra.  You've just 
inadvertently deleted that column for every row that lacked the value as a 
result of the outer join.  It can be argued that you should be careful with 
how you project after the join.  However, I think the right plan is to disable 
deletes by default and add a configurable property that enables them for the 
cases where you explicitly want them.
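
To make the scenario concrete, here is a minimal sketch of the proposed 
behavior. It is Python rather than the actual Java in CassandraStorage, and 
the names build_mutations and allow_deletes are hypothetical, chosen only for 
illustration. It assumes a null value is what triggers the column delete, and 
shows a null left behind by an outer join being skipped unless deletes are 
explicitly enabled.

```python
from typing import Dict, List, Optional, Tuple

# (operation, row_key, column_name, value) -- a simplified stand-in for a
# Cassandra mutation; not the real Thrift/CassandraStorage types.
Mutation = Tuple[str, str, str, Optional[str]]


def build_mutations(
    row_key: str,
    columns: Dict[str, Optional[str]],
    allow_deletes: bool = False,  # hypothetical flag, off by default
) -> List[Mutation]:
    """Turn one output row into mutations, treating None as a delete request."""
    mutations: List[Mutation] = []
    for name, value in columns.items():
        if value is None:
            # Only emit a delete when the user opted in; otherwise skip the
            # column so existing data is left untouched.
            if allow_deletes:
                mutations.append(("delete", row_key, name, None))
            continue
        mutations.append(("insert", row_key, name, value))
    return mutations


# Simulate a left outer join: u2 has no matching email, so it becomes None.
users = {"u1": {"name": "ann"}, "u2": {"name": "bob"}}
emails = {"u1": "ann@example.com"}
joined = {
    key: {"name": row["name"], "email": emails.get(key)}
    for key, row in users.items()
}

# Default: the null is ignored, nothing is deleted.
print(build_mutations("u2", joined["u2"]))
# Opt-in: the null becomes an explicit column delete.
print(build_mutations("u2", joined["u2"], allow_deletes=True))
```

With the default, the row for u2 produces only the insert for "name"; the 
delete for "email" appears only when allow_deletes is set.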

Fwiw, we had a bug in one of our scripts that did exactly as described above.  
It's good to fix the bug.  It's bad to implicitly delete data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
