[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570856#comment-14570856 ]

Philip Thompson commented on CASSANDRA-8609:

In that case, +1 to the patch, with logback set to WARN.

> Remove depency of hadoop to internals (Cell/CellName)
>
> Key: CASSANDRA-8609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
>
> For some reason most of the Hadoop code (ColumnFamilyRecordReader,
> CqlStorage, ...) uses the {{Cell}} and {{CellName}} classes. That dependency
> is entirely artificial: all this code is really client code that communicates
> with Cassandra over thrift/native protocol and there is thus no reason for it
> to use internal classes. And in fact, those classes are used in a very crude
> way, as a {{Pair}} really.
> But this dependency is really painful when we make changes to the internals.
> Further, every time we do so, I believe we break some of those APIs due
> to the change. This has been painful for CASSANDRA-5417 and this is now
> painful for CASSANDRA-8099. But while I somewhat hacked over it in
> CASSANDRA-5417, this was a mistake and we should have removed the dependency
> back then. So let's do that now.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570443#comment-14570443 ]

Sam Tunnicliffe commented on CASSANDRA-8609:

I set it to ERROR because WARN includes lots of noise from the registering/unregistering of Hadoop metrics mbeans. It was helpful for me to filter that stuff out, but it's easy to do yourself when running the tests locally, so I'm fine with reverting the change to logback-test.xml if it helps with diagnosing failures on cassci.

> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569757#comment-14569757 ]

Philip Thompson commented on CASSANDRA-8609:

Why set the hadoop logging to ERROR instead of WARN? Some of the WARN messages have been useful in the past when debugging pig tests.

The switch to ColumnFamilyRecordReader.Column seems reasonable/correct. I can't find anywhere you've missed removing Cell or CellName. All of the other changes match what Alex/I did. Results on cassci look good. No objections to removing ACS from me.

I'd prefer you set logging to WARN, but either way, +1.

> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569536#comment-14569536 ]

Sam Tunnicliffe commented on CASSANDRA-8609:

Not just yet I'm afraid, there are still some dependencies on internals like {{IPartitioner}} and {{AbstractType}} in determining splits. Ideally, the Hadoop integration code shouldn't have any dependencies beyond the cql driver and thrift (and client-utils). For that to happen though, we'll need to duplicate a few internal things so that split sizing can be done entirely client side (along the lines of [SPARKC-94|https://datastax-oss.atlassian.net/browse/SPARKC-94]). Maybe that should be done as part of CASSANDRA-9353.

There are also some dependencies in {{CqlBulkRecordWriter}} ({{CFMetaData}}, {{Config}}, {{DatabaseDescriptor}}) which can't easily be removed yet.

> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
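To make the "split sizing done entirely client side" idea in the comment above a little more concrete, here is a minimal sketch of dividing a token range into sub-ranges using only JDK classes. It is not taken from the patch or from SPARKC-94: the class name {{TokenRangeSplitter}} is hypothetical, tokens are assumed to be signed 64-bit values, wrap-around ranges are not handled, and real split sizing would also need partition-size estimates from the cluster, which this ignores.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a token range without touching IPartitioner.
final class TokenRangeSplitter {

    // Divides the range (start, end] into at most `parts` contiguous,
    // non-empty sub-ranges. Each element is {subStart, subEnd}.
    static List<long[]> split(long start, long end, int parts) {
        if (parts < 1 || end <= start) {
            throw new IllegalArgumentException("need parts >= 1 and start < end");
        }
        BigInteger s = BigInteger.valueOf(start);
        // BigInteger avoids overflow when the range spans the whole long space.
        BigInteger width = BigInteger.valueOf(end).subtract(s);
        // Never create more sub-ranges than there are tokens in the range.
        BigInteger n = BigInteger.valueOf(parts).min(width);

        List<long[]> out = new ArrayList<>();
        BigInteger prev = s;
        for (int i = 1; i <= n.intValue(); i++) {
            BigInteger next = s.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            out.add(new long[] { prev.longValueExact(), next.longValueExact() });
            prev = next;
        }
        return out;
    }

    public static void main(String[] args) {
        for (long[] r : split(0L, 100L, 4)) {
            System.out.println("(" + r[0] + ", " + r[1] + "]");
        }
    }
}
```

The point of the sketch is only that the range arithmetic itself needs nothing from Cassandra's internals; how many sub-ranges to ask for is the part that depends on size estimates.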
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569473#comment-14569473 ]

Jeremiah Jordan commented on CASSANDRA-8609:

bq. I don't think we can completely remove the dependency on internal classes in this way as it would remove the ability to write M/R jobs which use timestamp and ttl.

[~beobal] can we at least make sure that they are not used anywhere in the non-thrift code (including common base classes)? If we rm all the thrift-using sub-classes we should not need to use any internal stuff, as everything should be using classes out of the java driver at that point, no?

> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569469#comment-14569469 ]

Sam Tunnicliffe commented on CASSANDRA-8609:

Pushed a new branch with the changes mentioned above, plus some further cleanup to the bundled examples. I've removed {{AbstractCassandraStorage}} as it seems it should have actually gone in CASSANDRA-8358.

Diff [here|https://github.com/apache/cassandra/compare/cassandra-2.2...beobal:8609-2.2] & I'll update with test results when cassci has run.

> Assignee: Sam Tunnicliffe
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567188#comment-14567188 ]

Sam Tunnicliffe commented on CASSANDRA-8609:

I don't think we can completely remove the dependency on internal classes in this way as it would remove the ability to write M/R jobs which use timestamp and ttl. While it doesn't break any of the bundled pig or hadoop examples, it's feasible for jobs out in the wild to be doing this.

I think the right thing to do is to create a new simple class in the {{org.apache.cassandra.hadoop}} package to represent a column (much like the old {{org.apache.cassandra.db.Column}} from 2.0) and use that throughout the thrift side of the hadoop integration. The {{ColumnFamilyRecordReader#unthriftifyX}} methods should then be translating from the thrift classes into these new simple columns.

Also, the utility of {{AbstractCassandraStorage}} isn't clear to me. {{CassandraStorage}} doesn't extend it and I can't find any reference to it in the project at all (i.e. it isn't being tested/exercised by any of the demos as far as I can tell). Is there any reason why users writing their own {{LoadStoreFunc}} would choose to extend {{ACS}} rather than {{CS}}? At the very least, shouldn't it be marked deprecated like {{CS}}?

> Assignee: Philip Thompson
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2-2.txt, 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
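The "new simple class" proposed in the comment above might look roughly like the following. This is only a sketch under stated assumptions, not the class that was committed: the name {{SimpleColumn}}, its fields, and the {{fromRawFields}} helper are hypothetical, and the helper takes plain byte arrays as a stand-in for the thrift column so that the example compiles without thrift on the classpath. A TTL of 0 is used here to mean "no TTL".

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a plain holder for one thrift-side column, so Hadoop
// code does not need the internal Cell/CellName classes.
final class SimpleColumn {
    final ByteBuffer name;
    final ByteBuffer value;
    final long timestamp; // write timestamp, kept so M/R jobs can still read it
    final int ttl;        // 0 means "no TTL" in this sketch

    SimpleColumn(ByteBuffer name, ByteBuffer value, long timestamp, int ttl) {
        // Read-only views keep callers from modifying the stored bytes.
        this.name = name.asReadOnlyBuffer();
        this.value = value.asReadOnlyBuffer();
        this.timestamp = timestamp;
        this.ttl = ttl;
    }

    // Stand-in for an "unthriftify" step: accepts the raw fields a thrift
    // column would carry; ttl may be null when none was set.
    static SimpleColumn fromRawFields(byte[] name, byte[] value, long timestamp, Integer ttl) {
        return new SimpleColumn(ByteBuffer.wrap(name), ByteBuffer.wrap(value),
                                timestamp, ttl == null ? 0 : ttl);
    }

    // Decoding works on a duplicate so the stored buffer's position is untouched.
    String nameAsString() {
        return StandardCharsets.UTF_8.decode(name.duplicate()).toString();
    }

    String valueAsString() {
        return StandardCharsets.UTF_8.decode(value.duplicate()).toString();
    }

    public static void main(String[] args) {
        SimpleColumn c = fromRawFields("age".getBytes(StandardCharsets.UTF_8),
                                       "42".getBytes(StandardCharsets.UTF_8),
                                       1433000000000L, null);
        System.out.println(c.nameAsString() + "=" + c.valueAsString()
                           + " ts=" + c.timestamp + " ttl=" + c.ttl);
    }
}
```

Keeping timestamp and ttl as plain fields is what preserves the M/R use case mentioned in the comment, while the class itself depends on nothing beyond the JDK.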
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552968#comment-14552968 ]

Philip Thompson commented on CASSANDRA-8609:

Here are the CI results including the fix from CASSANDRA-9442. Once that is committed, you can review.
http://cassci.datastax.com/view/Dev/view/ptnapoleon/job/ptnapoleon-cassandra-8609-testall/lastCompletedBuild/testReport/

> Assignee: Philip Thompson
> Fix For: 2.2.0 rc1
> Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549169#comment-14549169 ]

Philip Thompson commented on CASSANDRA-8609:

Oh, you will also notice a probably out-of-scope change in CqlRecordWriter. I've decided it's probably best to move that to another ticket, and block this.

> Assignee: Philip Thompson
> Fix For: 2.2 rc1
> Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549139#comment-14549139 ]

Philip Thompson commented on CASSANDRA-8609:

Waiting on test results from cassci, but here is the patch:
Squashed: https://github.com/ptnapoleon/cassandra/tree/8609-squashed-notest
Normal: https://github.com/ptnapoleon/cassandra/tree/cassandra-8609

This is ready for review, but not for commit. I had to merge in CASSANDRA-9410 due to build issues. Once/if that is committed, I will merge cassandra-2.2 back into my branch.

> Assignee: Philip Thompson
> Fix For: 2.2 rc1
> Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548154#comment-14548154 ]

Aleksey Yeschenko commented on CASSANDRA-8609:

With Thrift-based Hadoop code going away in 3.0/CASSANDRA-9353, it's important that the 2.2 versions of it can work without any internal dependencies (so that you can use the 2.2 versions with 3.0, if you need to).

> Assignee: Philip Thompson
> Fix For: 2.2 rc1
> Attachments: CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502788#comment-14502788 ]

Philip Thompson commented on CASSANDRA-8609:

This is not a duplicate; we will need something additional on top of CASSANDRA-8358.

> Assignee: Philip Thompson
> Fix For: 3.0
> Attachments: CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500199#comment-14500199 ]

Sylvain Lebresne commented on CASSANDRA-8609:

[~philipthompson] So is this just a duplicate of CASSANDRA-8358 or will we need something more for this?

> Assignee: Philip Thompson
> Fix For: 3.0
> Attachments: CASSANDRA-8609-3.0-branch.txt
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319176#comment-14319176 ]

Alex Liu commented on CASSANDRA-8609:

All pig tests fail, and CASSANDRA-8358 is addressing the issue. I have attached my patch on cassandra-3.0 as a reference for [~philipthompson]. He will take over this ticket and address it on CASSANDRA-8358.

> Assignee: Alex Liu
> Fix For: 3.0
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319162#comment-14319162 ]

Philip Thompson commented on CASSANDRA-8609:

Alex, I believe most of this will be covered by CASSANDRA-8358. At the very least the pig-test failures are fixed there.

> Assignee: Alex Liu
> Fix For: 3.0
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319020#comment-14319020 ]

Alex Liu commented on CASSANDRA-8609:

I ran pig-test on trunk and found some failing test cases; I am fixing those in this ticket as well.

> Assignee: Alex Liu
> Fix For: 3.0
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318521#comment-14318521 ]

Sylvain Lebresne commented on CASSANDRA-8609:

bq. Does this ticket only remove Cell and CellName from the Hadoop-related classes?

Basically, yes.

bq. Sorry, I missed this ticket. I will work on it today or tomorrow to get it done.

No problem, thanks.

> Assignee: Alex Liu
> Fix For: 3.0
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318501#comment-14318501 ]

Alex Liu commented on CASSANDRA-8609:

Sorry, I missed this ticket. I will work on it today or tomorrow to get it done.

Does this ticket only remove Cell and CellName from the Hadoop-related classes?

> Assignee: Alex Liu
> Fix For: 3.0
[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)
[ https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317851#comment-14317851 ]

Sylvain Lebresne commented on CASSANDRA-8609:

[~alexliu68] Any chance you'll be able to work on this soonish? Otherwise we'll re-assign as we kind of want that for 3.0.

> Assignee: Alex Liu
> Fix For: 3.0