[jira] [Commented] (CASSANDRA-8358) Bundled tools shouldn't be using Thrift API

Philip Thompson (JIRA) Wed, 28 Jan 2015 17:14:41 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296200#comment-14296200 ]


Philip Thompson commented on CASSANDRA-8358:
--------------------------------------------

Here is my current branch: https://github.com/ptnapoleon/cassandra/compare/8358
Sorry about the WIP changes pushed to BulkLoader; ignore those for now. I have 
recently received a JAR of a tentative 2.1.5 release of the driver containing 
JAVA-312, so I can finish work on this now.

I was having an issue where the Thread calling the java driver's connect() was 
being interrupted, which was causing the connect() to fail. Currently I check 
for Thread.interrupted() and retry if that is the reason for the failure. I am 
not sure how to prevent the interruption in the first place.
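
The retry described above can be sketched roughly as follows. This is only an illustration, not the code on the branch: the class InterruptRetry, the method callWithInterruptRetry, and the maxAttempts bound are hypothetical names, and the connect() call is represented by a generic Callable rather than by the driver API. Note that Thread.interrupted() clears the interrupt flag as a side effect, so this sketch restores the flag before returning; whether that is desirable here is an assumption.

```java
import java.util.concurrent.Callable;

public class InterruptRetry {
    /**
     * Runs the action, retrying when a failure coincides with the calling
     * thread having been interrupted. Hypothetical helper for illustration.
     */
    static <T> T callWithInterruptRetry(Callable<T> action, int maxAttempts) throws Exception {
        boolean wasInterrupted = false;
        try {
            for (int attempt = 1; ; attempt++) {
                try {
                    return action.call();
                } catch (Exception e) {
                    // Thread.interrupted() reports the flag AND clears it,
                    // so the next attempt starts with a clean flag.
                    boolean interrupted = Thread.interrupted();
                    if (interrupted)
                        wasInterrupted = true;
                    if (interrupted && attempt < maxAttempts)
                        continue; // failure attributed to the interrupt: retry
                    throw e;      // unrelated failure, or out of attempts
                }
            }
        } finally {
            // Assumption: callers still want to observe the interruption.
            if (wasInterrupted)
                Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = callWithInterruptRetry(() -> {
            if (calls[0]++ == 0) {
                // Simulate a connect() attempt that fails due to an interrupt.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("simulated connect failure");
            }
            return "connected";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
        Thread.interrupted(); // clear the restored flag before exiting
    }
}
```

This only works around the symptom; it does not explain where the interrupt comes from.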

Currently when running pig-test, only one test that uses CqlNativeStorage is 
failing, and that is testCqlNativeStorageCollectionColumnTable. 
This is due to the following problem:
{code}
java.lang.IllegalArgumentException
        at java.nio.Buffer.limit(Buffer.java:267)
        at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:552)
        at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:561)
        at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:118)
        at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:100)
        at org.apache.cassandra.cql3.Maps$Value.fromSerialized(Maps.java:164)
        at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:273)
        at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:262)
        at org.apache.cassandra.cql3.Maps$Putter.doPut(Maps.java:355)
        at org.apache.cassandra.cql3.Maps$Setter.execute(Maps.java:292)
        at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:98)
        at org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:655)
        at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:487)
        at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:473)
        at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:233)
        at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:443)
        at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:134)
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
        at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
        at java.lang.Thread.run(Thread.java:745)
{code}
The error arises in CollectionSerializer.readValue:
{code}
    public static ByteBuffer readValue(ByteBuffer input, int version)
    {
        if (version >= Server.VERSION_3)
        {
            int size = input.getInt();
            if (size < 0)
                return null;

            return ByteBufferUtil.readBytes(input, size);
        }
        else
        {
            return ByteBufferUtil.readBytesWithShortLength(input);
        }
    }
{code}
The value of size returned by input.getInt() is an integer in the millions for 
one of the map values. I am still working out what differs from cassandra-2.1, 
where the test passes without my changes, but the ByteBuffer itself does not 
appear to be different.
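
For illustration only, and not as a diagnosis of this failure: one way a length in the millions can appear is when bytes written with a 2-byte length prefix are read back with a 4-byte one. The sketch below uses only java.nio; the class LengthPrefixMismatch and its methods are hypothetical and merely mimic the shape of the two branches in readValue.

```java
import java.nio.ByteBuffer;

public class LengthPrefixMismatch {
    // Serializes a value with a 2-byte (unsigned short) length prefix.
    static ByteBuffer writeWithShortLength(byte[] value) {
        ByteBuffer out = ByteBuffer.allocate(2 + value.length);
        out.putShort((short) value.length);
        out.put(value);
        out.flip();
        return out;
    }

    // Returns a view of the next `size` bytes and advances the input.
    static ByteBuffer readBytes(ByteBuffer input, int size) {
        ByteBuffer slice = input.duplicate();
        // Buffer.limit throws IllegalArgumentException when the new limit
        // exceeds the buffer's capacity.
        slice.limit(slice.position() + size);
        input.position(slice.limit());
        return slice;
    }

    // Reader that expects a 2-byte length prefix.
    static ByteBuffer readWithShortLength(ByteBuffer input) {
        int size = input.getShort() & 0xFFFF;
        return readBytes(input, size);
    }

    // Reader that expects a 4-byte length prefix.
    static ByteBuffer readWithIntLength(ByteBuffer input) {
        int size = input.getInt();
        if (size < 0)
            return null;
        return readBytes(input, size);
    }

    public static void main(String[] args) {
        byte[] value = new byte[32]; // 32 zero bytes

        // Matching reader: the 32 bytes come back intact.
        ByteBuffer ok = readWithShortLength(writeWithShortLength(value));
        System.out.println("short-length read: " + ok.remaining() + " bytes");

        // Mismatched reader: 0x0020 followed by two value bytes is read as
        // the int 0x00200000, far larger than the buffer.
        ByteBuffer buf = writeWithShortLength(value);
        System.out.println("misread size: " + buf.duplicate().getInt());
        try {
            readWithIntLength(buf);
        } catch (IllegalArgumentException e) {
            System.out.println("int-length read failed: " + e);
        }
    }
}
```

With a 32-byte value, the misread size is 2097152, so the subsequent limit() call fails in the same way as the top frame of the trace.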

In CqlConfigHelper, should I be creating an OUTPUT_* property for each INPUT_* 
property?

PigTestBase should be switched over to using the java driver, but I would 
rather handle that in a separate ticket.
AbstractCassandraStorage may be deprecated for 3.0, but it is not working at 
all with the current schema parsing queries. That also belongs in a separate 
ticket.

> Bundled tools shouldn't be using Thrift API
> -------------------------------------------
>
>                 Key: CASSANDRA-8358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Philip Thompson
>             Fix For: 3.0
>
>
> In 2.1, we switched cqlsh to the python-driver.
> In 3.0, we got rid of cassandra-cli.
> Yet there is still code that's using legacy Thrift API. We want to convert it 
> all to use the java-driver instead.
> 1. BulkLoader uses Thrift to query the schema tables. It should be using 
> java-driver metadata APIs directly instead.
> 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift
> 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift
> 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift
> 5. o.a.c.hadoop.pig.CqlStorage is using Thrift
> Some of the things listed above use Thrift to get the list of partition key 
> columns or clustering columns. Those should be converted to use the Metadata 
> API of the java-driver.
> Somewhat related to that, we also have badly ported code from Thrift in 
> o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches 
> columns from schema tables instead of properly using the driver's Metadata 
> API.
> We need all of it fixed. One exception, for now, is 
> o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its 
> describe_splits_ex() call that cannot be currently replaced by any 
> java-driver call (?).
> Once this is done, we can stop starting Thrift RPC port by default in 
> cassandra.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
