[jira] [Commented] (FLINK-3311) Add a connector for streaming data into Cassandra

ASF GitHub Bot (JIRA) Thu, 05 May 2016 07:38:38 -0700

    [ https://issues.apache.org/jira/browse/FLINK-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272454#comment-15272454 ]


ASF GitHub Bot commented on FLINK-3311:
---------------------------------------

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1771#issuecomment-217171304
  
    I just tried the PR, but the recovery after a failure doesn't seem to work:
    
    ```
    java.lang.RuntimeException: Error triggering a checkpoint as the result of receiving checkpoint barrier
        at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:681)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:674)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
        at org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
        at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:175)
        at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:65)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.RuntimeException: Failed to fetch state handle size
        at org.apache.flink.runtime.taskmanager.RuntimeEnvironment.acknowledgeCheckpoint(RuntimeEnvironment.java:234)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:511)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:678)
        ... 8 more
    Caused by: java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/user/robert/cassandra-fs/e70d0b78b7875877f42a8ebfba463f14/chk-0/9f892bc0-b5e2-484f-a981-6e666e7ad897
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
        at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:351)
        at org.apache.flink.runtime.state.filesystem.AbstractFileStateHandle.getFileSize(AbstractFileStateHandle.java:93)
        at org.apache.flink.runtime.state.filesystem.FileStreamStateHandle.getStateSize(FileStreamStateHandle.java:58)
        at org.apache.flink.runtime.state.AbstractStateBackend$DataInputViewHandle.getStateSize(AbstractStateBackend.java:428)
        at org.apache.flink.streaming.runtime.operators.GenericAtLeastOnceSink$ExactlyOnceState.getStateSize(GenericAtLeastOnceSink.java:190)
        at org.apache.flink.streaming.runtime.tasks.StreamTaskStateList.getStateSize(StreamTaskStateList.java:81)
        at org.apache.flink.runtime.taskmanager.RuntimeEnvironment.acknowledgeCheckpoint(RuntimeEnvironment.java:231)
        ... 10 more
    ```


> Add a connector for streaming data into Cassandra
> -------------------------------------------------
>
>                 Key: FLINK-3311
>                 URL: https://issues.apache.org/jira/browse/FLINK-3311
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming Connectors
>            Reporter: Robert Metzger
>            Assignee: Andrea Sella
>
> We had users in the past asking for a Flink+Cassandra integration.
> It seems that there is a well-developed Java client for connecting to
> Cassandra: https://github.com/datastax/java-driver (ASL 2.0)
> There are also tutorials out there on how to start a local Cassandra instance
> (for the tests):
> http://prettyprint.me/prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/index.html
> For the data types, I think we should support TupleX types, and map standard
> Java types to the respective Cassandra types.
> In addition, it seems that there is an object mapper from DataStax to store
> POJOs in Cassandra (there are annotations for defining the primary key and
> types).
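
The type-mapping idea in the description could be sketched roughly as follows. This is only an illustration, not code from the pull request: the class `CqlTypeSketch` and its methods `cqlTypeFor` and `columnsFor` are hypothetical names, and the table covers just a handful of Java types using the CQL type names they are conventionally paired with. The `f0`, `f1`, ... column names follow Flink's tuple field naming.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical helper; not part of Flink or the DataStax driver. */
final class CqlTypeSketch {

    // Assumed mapping from standard Java types to CQL type names.
    private static final Map<Class<?>, String> JAVA_TO_CQL = new LinkedHashMap<>();

    static {
        JAVA_TO_CQL.put(String.class, "text");
        JAVA_TO_CQL.put(Integer.class, "int");
        JAVA_TO_CQL.put(Long.class, "bigint");
        JAVA_TO_CQL.put(Float.class, "float");
        JAVA_TO_CQL.put(Double.class, "double");
        JAVA_TO_CQL.put(Boolean.class, "boolean");
        JAVA_TO_CQL.put(BigDecimal.class, "decimal");
        JAVA_TO_CQL.put(BigInteger.class, "varint");
        JAVA_TO_CQL.put(UUID.class, "uuid");
        JAVA_TO_CQL.put(ByteBuffer.class, "blob");
    }

    private CqlTypeSketch() {
    }

    /** Returns the CQL type name for a Java class, or throws if it is unmapped. */
    static String cqlTypeFor(Class<?> javaType) {
        String cqlType = JAVA_TO_CQL.get(javaType);
        if (cqlType == null) {
            throw new IllegalArgumentException("No CQL mapping for " + javaType.getName());
        }
        return cqlType;
    }

    /** Builds a column list such as "f0 text, f1 bigint" from tuple field types. */
    static String columnsFor(Class<?>... fieldTypes) {
        StringBuilder columns = new StringBuilder();
        for (int i = 0; i < fieldTypes.length; i++) {
            if (i > 0) {
                columns.append(", ");
            }
            columns.append('f').append(i).append(' ').append(cqlTypeFor(fieldTypes[i]));
        }
        return columns.toString();
    }
}
```

For a `Tuple2<String, Long>`, `CqlTypeSketch.columnsFor(String.class, Long.class)` would yield `f0 text, f1 bigint`. Unmapped types fail fast with an exception here; a real connector might instead fall back to the driver's own codecs.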



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
