[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296200#comment-14296200 ]
Philip Thompson commented on CASSANDRA-8358:
--------------------------------------------

Here is my current branch: https://github.com/ptnapoleon/cassandra/compare/8358

Sorry about the WIP changes pushed to BulkLoader; ignore those for now. I have recently received a JAR of a tentative 2.1.5 release of the driver containing JAVA-312, so I can finish work on this now.

I was having an issue where the thread calling the java driver's connect() was being interrupted, which was causing connect() to fail. Currently I check Thread.interrupted() and retry if that is the reason for the failure. I am not sure how to prevent the interruption in the first place.

Currently, when running pig-test, only one test that uses CqlNativeStorage is failing: testCqlNativeStorageCollectionColumnTable. This is due to the following problem:

{code}
java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:267)
    at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:552)
    at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:561)
    at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:118)
    at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:100)
    at org.apache.cassandra.cql3.Maps$Value.fromSerialized(Maps.java:164)
    at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:273)
    at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:262)
    at org.apache.cassandra.cql3.Maps$Putter.doPut(Maps.java:355)
    at org.apache.cassandra.cql3.Maps$Setter.execute(Maps.java:292)
    at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:98)
    at org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:655)
    at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:487)
    at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:473)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:233)
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:443)
    at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:134)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
    at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
    at java.lang.Thread.run(Thread.java:745)
{code}

The error comes from CollectionSerializer.readValue:

{code}
public static ByteBuffer readValue(ByteBuffer input, int version)
{
    if (version >= Server.VERSION_3)
    {
        int size = input.getInt();
        if (size < 0)
            return null;
        return ByteBufferUtil.readBytes(input, size);
    }
    else
    {
        return ByteBufferUtil.readBytesWithShortLength(input);
    }
}
{code}

The value of size from input.getInt() is an integer in the millions for one of the map values. I am still figuring out what differs from cassandra-2.1, where the test passes without my changes, but the ByteBuffer itself doesn't appear to be different.

In CqlConfigHelper, should I be creating an OUTPUT_* property for each INPUT_* property?

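The failure mode in the stack trace above, where a length prefix larger than the remaining bytes makes Buffer.limit throw IllegalArgumentException, can be reproduced with plain java.nio. The sketch below is not Cassandra code: LengthPrefixDemo, writeWithShortLength and readWithIntLength are hypothetical names. It also assumes, purely for illustration, that a value written with a 2-byte length prefix is read back with a 4-byte prefix; that is only one way a size in the millions could arise, not a confirmed diagnosis of this test failure.

```java
import java.nio.ByteBuffer;

class LengthPrefixDemo {
    // Hypothetical writer: 2-byte length prefix followed by the value bytes.
    static ByteBuffer writeWithShortLength(byte[] value) {
        ByteBuffer out = ByteBuffer.allocate(2 + value.length);
        out.putShort((short) value.length);
        out.put(value);
        out.flip();
        return out;
    }

    // Hypothetical reader: 4-byte length prefix, then a slice of that many bytes.
    static ByteBuffer readWithIntLength(ByteBuffer input) {
        int size = input.getInt();
        if (size < 0)
            return null;
        ByteBuffer copy = input.duplicate();
        // Buffer.limit throws IllegalArgumentException if the new limit exceeds capacity.
        copy.limit(copy.position() + size);
        input.position(input.position() + size);
        return copy;
    }

    public static void main(String[] args) {
        // 32 zero bytes: the prefix 0x0020 plus the first two value bytes
        // are read together as the int 0x00200000 = 2097152.
        ByteBuffer encoded = writeWithShortLength(new byte[32]);
        System.out.println("size seen by int reader: " + encoded.duplicate().getInt());
        try {
            readWithIntLength(encoded);
        } catch (IllegalArgumentException e) {
            System.out.println("limit() rejected the bogus size: " + e);
        }
    }
}
```

If the real cause is of this kind, comparing the protocol version used when the map value is serialized with the version passed to readValue would be a natural next check.
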
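The retry-on-interrupt workaround described above for connect() can be sketched without the driver. This is a minimal illustration, not the code on the branch: InterruptRetry and retryIfInterrupted are hypothetical names, the Supplier stands in for the call to connect(), and it assumes the failure surfaces as a RuntimeException with the thread's interrupt flag still set.

```java
import java.util.function.Supplier;

class InterruptRetry {
    // Runs the action, retrying only when the failure coincides with an interrupt.
    static <T> T retryIfInterrupted(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Thread.interrupted() reports the interrupt flag and clears it,
                // so the next attempt starts with a clean flag.
                if (!Thread.interrupted())
                    throw e; // unrelated failure: do not retry
                last = e;
            }
        }
        throw last; // every attempt was interrupted
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated connect(): the first call fails with the interrupt flag set.
        String session = retryIfInterrupted(() -> {
            if (calls[0]++ == 0) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("interrupted while connecting");
            }
            return "connected";
        }, 3);
        System.out.println(session + " after " + calls[0] + " attempts");
    }
}
```

Because Thread.interrupted() clears the flag, this approach swallows the interrupt rather than explaining it, so it works around the problem without identifying what is interrupting the thread.
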
PigTestBase should be switched over to using the java driver, but I would rather handle that in a separate ticket. AbstractCassandraStorage may be deprecated for 3.0, but it is not working at all with the current schema parsing queries. That also belongs in a separate ticket.

> Bundled tools shouldn't be using Thrift API
> -------------------------------------------
>
>                 Key: CASSANDRA-8358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Philip Thompson
>             Fix For: 3.0
>
>
> In 2.1, we switched cqlsh to the python-driver.
> In 3.0, we got rid of cassandra-cli.
> Yet there is still code that's using legacy Thrift API. We want to convert it
> all to use the java-driver instead.
> 1. BulkLoader uses Thrift to query the schema tables. It should be using
> java-driver metadata APIs directly instead.
> 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift
> 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift
> 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift
> 5. o.a.c.hadoop.pig.CqlStorage is using Thrift
> Some of the things listed above use Thrift to get the list of partition key
> columns or clustering columns. Those should be converted to use the Metadata
> API of the java-driver.
> Somewhat related to that, we also have badly ported code from Thrift in
> o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches
> columns from schema tables instead of properly using the driver's Metadata
> API.
> We need all of it fixed. One exception, for now, is
> o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its
> describe_splits_ex() call that cannot be currently replaced by any
> java-driver call (?).
> Once this is done, we can stop starting Thrift RPC port by default in
> cassandra.yaml.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)