[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567188#comment-14567188
 ] 

Sam Tunnicliffe commented on CASSANDRA-8609:
--------------------------------------------

I don't think we can completely remove the dependency on internal classes in 
this way as it would remove the ability to write M/R jobs which use timestamp 
and ttl. While it doesn't break any of the bundled pig or hadoop examples, it's 
feasible for jobs out in the wild to be doing this. 

I think the right thing to do is to create a new simple class in the 
{{org.apache.cassandra.hadoop}} package to represent a column (much like the 
old {{org.apache.cassandra.db.Column}} from 2.0) and use that throughout the 
thrift side of the hadoop integration. The 
{{ColumnFamilyRecordReader#unthriftifyX}} methods should then be translating 
from the thrift classes into these new simple columns.
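
A minimal sketch of what such a class might look like follows. The name {{HadoopColumn}}, its accessors, the {{ofUtf8}} helper and the convention that a ttl of 0 means "does not expire" are all assumptions made for illustration; none of them come from an existing patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical value class: the name and layout are illustrative assumptions,
// loosely modelled on the fields a thrift column carries.
final class HadoopColumn
{
    private final ByteBuffer name;
    private final ByteBuffer value;
    private final long timestamp;
    private final int ttl; // assumption: 0 means the column does not expire

    HadoopColumn(ByteBuffer name, ByteBuffer value, long timestamp, int ttl)
    {
        // duplicate() gives us independent position/limit over the same bytes,
        // so callers advancing their own buffers do not affect this column
        this.name = name.duplicate();
        this.value = value.duplicate();
        this.timestamp = timestamp;
        this.ttl = ttl;
    }

    // Accessors hand out duplicates so that reading one does not consume
    // the column's own buffers.
    ByteBuffer name() { return name.duplicate(); }
    ByteBuffer value() { return value.duplicate(); }
    long timestamp() { return timestamp; }
    int ttl() { return ttl; }
    boolean isExpiring() { return ttl > 0; }

    // Convenience factory for examples and tests (hypothetical helper).
    static HadoopColumn ofUtf8(String name, String value, long timestamp, int ttl)
    {
        return new HadoopColumn(ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8)),
                                ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)),
                                timestamp,
                                ttl);
    }
}
```

With something of this shape, the {{unthriftifyX}} methods would copy name, value, timestamp and ttl out of the thrift column into the new class, so M/R jobs keep access to timestamp and ttl without touching {{Cell}} or {{CellName}}.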

Also, the utility of {{AbstractCassandraStorage}} isn't clear to me. 
{{CassandraStorage}} doesn't extend it and I can't find any reference to it in 
the project at all (i.e. it isn't being tested/exercised by any of the demos as 
far as I can tell). Is there any reason why users writing their own 
{{LoadStoreFunc}} would choose to extend {{ACS}} rather than {{CS}}? At the 
very least, shouldn't it be marked deprecated like {{CS}}?

> Remove dependency of hadoop to internals (Cell/CellName)
> --------------------------------------------------------
>
>                 Key: CASSANDRA-8609
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Philip Thompson
>             Fix For: 2.2.0 rc1
>
>         Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
> CASSANDRA-8609-3.0-branch.txt
>
>
> For some reason most of the Hadoop code (ColumnFamilyRecordReader, 
> CqlStorage, ...) uses the {{Cell}} and {{CellName}} classes. That dependency 
> is entirely artificial: all this code is really client code that communicates 
> with Cassandra over thrift/native protocol and there is thus no reason for it 
> to use internal classes. And in fact, those classes are used in a very crude 
> way, as a {{Pair<ByteBuffer, ByteBuffer>}} really.
> But this dependency is really painful when we make changes to the internals. 
> Further, every time we do so, I believe we break some of those APIs due 
> to the change. This has been painful for CASSANDRA-5417 and this is now 
> painful for CASSANDRA-8099. But while I somewhat hacked over it in 
> CASSANDRA-5417, this was a mistake and we should have removed the dependency 
> back then. So let's do that now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
