[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960050#comment-13960050 ]

Shridhar commented on CASSANDRA-6311:

[~alexliu68] We downloaded cassandra-2.0.6 and applied the patch (6311-v11.txt) on top of it. We are still getting the same error as in CASSANDRA-6151.

Add CqlRecordReader to take advantage of native CQL pagination

    Key: CASSANDRA-6311
    URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
    Project: Cassandra
    Issue Type: New Feature
    Components: Hadoop
    Reporter: Alex Liu
    Assignee: Alex Liu
    Fix For: 2.0.7
    Attachments: 6311-v10.txt, 6311-v11.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt

Since the latest CQL pagination is done and should be more efficient, we need to update CqlPagingRecordReader to use it instead of the custom Thrift paging.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950429#comment-13950429 ]

Alex Liu commented on CASSANDRA-6311:

It's scheduled for the next release. I am not sure when that will be. If you can't wait, you can build your own release from the cassandra-2.0 branch.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949153#comment-13949153 ]

Shridhar commented on CASSANDRA-6311:

When might Cassandra 2.0.7 be released?
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946479#comment-13946479 ]

Sylvain Lebresne commented on CASSANDRA-6311:

So is there a problem with what has been committed here? And if there is, is someone working on some fixup (or planning to)?
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946634#comment-13946634 ]

Dave Brosius commented on CASSANDRA-6311:

Yes, problems. It seems to me the next method signature needs to be something like

public Pair<Long, Row> next() throws IOException
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946681#comment-13946681 ]

Alex Liu commented on CASSANDRA-6311:

The public boolean next(Long key, Row value) throws IOException signature is fine. The problem is that the old Hadoop API uses
{code}
public Long createKey()
public Row createValue()
{code}
to create the key and value objects, and then uses
{code}
public boolean next(Long key, Row value)
{code}
to set the properties of the key and value objects.

Because Long is a wrapper of long, there is no way to set the long value of the key. But since it's only the row count, it's not useful anyway, so we can ignore it.

Row, however, is backed by ArrayBackedRow, a protected class of the Cassandra Java driver. It has two properties,
{code}
private final ColumnDefinitions metadata;
private final List<ByteBuffer> data;
{code}
which should be opened up so that they can be set. Also, since it's a protected class, it should be made public.

The fix looks like
{code}
public Row createValue()
{
    return new ArrayBackedRow(null, null);
}

public boolean next(Long key, Row value) throws IOException
{
    if (nextKeyValue())
    {
        value.setColumnDefinitions(getCurrentValue().getColumnDefinitions());
        value.setData(getCurrentValue().getData());
        return true;
    }
    return false;
}
{code}
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946684#comment-13946684 ]

Alex Liu commented on CASSANDRA-6311:

[~slebresne] Should we open a ticket for the Cassandra Java driver to open up ArrayBackedRow?
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946714#comment-13946714 ]

Sylvain Lebresne commented on CASSANDRA-6311:

I don't really understand what's going on here, but something is not right, so I've reverted this for now.

bq. Which should be opened up, so the properties can be set.

If you mean that you want the Java driver to add some {{setColumnDefinitions}} and {{setData}} to its {{Row}} object, then there is no chance of this happening: that would make no sense for the driver from an API point of view, since {{Row}} is very much an immutable object on purpose. That being said, {{Row}} is actually an interface in the driver, so one can write a mutable implementation if one really, really wants to. But I'm a bit surprised the Hadoop API forces you to go there, tbh.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946722#comment-13946722 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

This is the old, deprecated Hadoop API. Can we just drop support for it? Making Row mutable in the driver is a no-go; however, you could use some reflection magic to set a final private field. But you could not do that safely with a Long, so it would still be slightly broken anyway (but usable).
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946784#comment-13946784 ]

Alex Liu commented on CASSANDRA-6311:

I can create a wrapper class around Row, so we don't need to modify the Java driver.
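The wrapper idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from any attached patch: `Row` here is a hypothetical one-method stand-in for the driver's interface, and `SketchReader` is a made-up reader that mimics the old Hadoop createValue()/next() contract without depending on Hadoop. The name `WrappedRow` is borrowed from the diff quoted elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class WrappedRowDemo {
    // Reads every record through one reusable value object, as the old API does.
    static String readAll(List<Row> input) {
        SketchReader reader = new SketchReader(input);
        WrappedRow value = reader.createValue(); // created once, refilled per record
        StringBuilder out = new StringBuilder();
        while (reader.next(value)) {
            out.append(value.getString("name")).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Row> input = Arrays.<Row>asList(column -> "alice", column -> "bob");
        System.out.println(readAll(input));
    }
}

// Hypothetical, heavily simplified stand-in for the driver's Row interface.
interface Row {
    String getString(String column);
}

// Mutable wrapper: implements Row by delegating to whichever row is currently set.
final class WrappedRow implements Row {
    private Row delegate;

    void setRow(Row row) {
        this.delegate = row;
    }

    @Override
    public String getString(String column) {
        if (delegate == null) {
            throw new IllegalStateException("no row has been set yet");
        }
        return delegate.getString(column);
    }
}

// Made-up reader mimicking the old-API shape: createValue() hands out the
// wrapper and next() points it at the next underlying row.
final class SketchReader {
    private final Iterator<Row> rows;

    SketchReader(List<Row> rows) {
        this.rows = rows.iterator();
    }

    WrappedRow createValue() {
        return new WrappedRow();
    }

    boolean next(WrappedRow value) {
        if (!rows.hasNext()) {
            return false;
        }
        value.setRow(rows.next());
        return true;
    }
}
```

Because the sketch declares the wrapper type in the `next` signature, no downcast is needed; whether the real reader can do the same depends on how its key and value types are declared.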
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946811#comment-13946811 ]

Alex Liu commented on CASSANDRA-6311:

It basically creates a new object and sets the properties. Even with reflection, we would still need to create a new object and set all the properties. Also, if a new implementation of Row is added, the reflection code will not cover the new implementation class.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947415#comment-13947415 ]

Dave Brosius commented on CASSANDRA-6311:

+public boolean next(Long key, Row value) throws IOException
+{
+    if (nextKeyValue())
+    {
+        ((WrappedRow)value).setRow(getCurrentValue());
+        return true;
+    }
+    return false;
+}

This assumes the parameter is a certain type not in evidence by the method signature, which in general is a brittle thing to do.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944922#comment-13944922 ]

Dave Brosius commented on CASSANDRA-6311:

CqlRecordReader.next() doesn't appear to be correct. It assigns values to parameters as if that does something.
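For readers unfamiliar with why assigning to parameters has no effect: Java passes object references by value, so rebinding a parameter inside a method never changes what the caller sees. The small demo below is not taken from CqlRecordReader; the class and method names are made up, and StringBuilder and AtomicLong merely stand in for a mutable value and a mutable key.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ParameterAssignmentDemo {
    // Rebinding the parameters only changes the method's local copies of the
    // references; the caller's key and value are untouched.
    static void reassign(Long key, StringBuilder value) {
        key = 42L;
        value = new StringBuilder("new row");
    }

    // Mutating the objects the parameters point at is visible to the caller.
    static void mutate(AtomicLong key, StringBuilder value) {
        key.set(42L);
        value.setLength(0);
        value.append("new row");
    }

    public static void main(String[] args) {
        Long key = 0L;
        StringBuilder value = new StringBuilder("old row");
        reassign(key, value);
        System.out.println(key + " / " + value); // prints "0 / old row"

        AtomicLong mutableKey = new AtomicLong(0);
        mutate(mutableKey, value);
        System.out.println(mutableKey + " / " + value); // prints "42 / new row"
    }
}
```

This is also why an immutable key type such as Long cannot be updated through this style of API, while a mutable value object can.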
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944925#comment-13944925 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

Indeed.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945345#comment-13945345 ]

Alex Liu commented on CASSANDRA-6311:

The key is a Long, which is the row count. The value is a Row, which is backed by ArrayBackedRow, a protected class. We need to make it a public class.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941898#comment-13941898 ]

Alex Liu commented on CASSANDRA-6311:

I will update it to 2.0.1
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940292#comment-13940292 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

+1
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940367#comment-13940367 ]

Sylvain Lebresne commented on CASSANDRA-6311:

Tried to commit this, but v9 doesn't seem to apply cleanly on the current cassandra-2.0 branch (unless that's meant to be 2.1 only, but that's not what the 'fix version' says, so...).

As a side note, we should update the driver dependency to 2.0.1 instead of 2.0.0-rc2.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939449#comment-13939449 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

{noformat}
 if (origHost != null)
+{
+    return Collections.singletonList(origHost).iterator();
+}
+else
+{
+    return liveRemoteHosts.iterator();
+}
{noformat}

This creates a race condition: if the origHost goes down immediately after returning from this method, the list of hosts to try will be empty and the query will fail. In the first branch, you should return origHost *and* liveRemoteHosts.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939476#comment-13939476 ]

Alex Liu commented on CASSANDRA-6311:

On the flip side, if the race condition happens and we keep returning remote nodes, then requests will still be sent to other nodes after origHost comes back up. The Java driver retries internally, so the Hadoop job fails (which raises an alert) only if the node is still down when the retries time out.

v9 is attached.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931573#comment-13931573 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

Now it looks better, but I don't get why you need this magic constant here:
{noformat}
if (count 2)
{noformat}

Why not just create a list of (origHost, liveRemoteHost1, liveRemoteHost2, ..., liveRemoteHostN) and return an iterator to it? To avoid creating a new list just to get an iterator, you can use Iterators.concat:
{noformat}
return Iterators.concat(Collections.singletonList(origHost).iterator(), liveRemoteHosts.iterator());
{noformat}
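The suggested query plan (preferred host first, live remote hosts as fallback) can be sketched without any dependencies. The following is a hedged illustration only: hosts are plain strings rather than driver Host objects, `concat` is a hand-written stdlib stand-in for Guava's Iterators.concat, and the null check on origHost is an assumption added for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class QueryPlanSketch {
    // Lazily chains two iterators without copying them into a new list.
    static <T> Iterator<T> concat(Iterator<? extends T> first, Iterator<? extends T> second) {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return first.hasNext() || second.hasNext();
            }

            @Override
            public T next() {
                if (first.hasNext()) {
                    return first.next();
                }
                if (second.hasNext()) {
                    return second.next();
                }
                throw new NoSuchElementException();
            }
        };
    }

    // Preferred host first, then every live remote host as a fallback.
    // Skipping a null origHost is an assumption made for this sketch.
    static Iterator<String> newQueryPlan(String origHost, List<String> liveRemoteHosts) {
        if (origHost == null) {
            return liveRemoteHosts.iterator();
        }
        return concat(Collections.singletonList(origHost).iterator(), liveRemoteHosts.iterator());
    }

    // Helper for the demo: collects an iterator into a list.
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> remotes = Arrays.asList("remote1", "remote2");
        System.out.println(drain(newQueryPlan("local", remotes))); // prints [local, remote1, remote2]
    }
}
```

With this shape the plan never shrinks to a single host, so a preferred host that goes down right after the plan is built still leaves the caller with hosts to try.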
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922162#comment-13922162 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

1. ok, I understand; that was a nice-to-have
3. ok

2: count is defined in the outer scope and is not local to the Iterator instance. Therefore creating two iterators for the same LB policy is going to mess it up:
{noformat}
 @Override
+return new AbstractIterator<Host>()
+{
+    protected Host computeNext()
+    {
+        count ++;
{noformat}

A policy should assign a LOCAL distance to nodes that are susceptible to be returned first by newQueryPlan and it is useless for newQueryPlan to return hosts to which it assigns an IGNORED distance.

Now that you may return other (remote) hosts from newQueryPlan, you should not return IGNORED in the distance:
{noformat}
+@Override
+public HostDistance distance(Host host)
+{
+    if (host.getAddress().getHostName().equals(stickHost))
+        return HostDistance.LOCAL;
+    else
+        return HostDistance.IGNORED;
+}
{noformat}
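The counter-scope problem from point 2 can be reproduced in isolation. The sketch below is an assumption-laden illustration, not the patch code: both policy classes, the `attempts` limit, and the use of plain java.util.Iterator with string hosts are invented for the demo.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class CounterScopeDemo {
    // Problematic shape: the counter is a field of the policy, so every
    // iterator the policy hands out shares (and exhausts) the same count.
    static final class SharedCountPolicy {
        private int count = 0;

        Iterator<String> newQueryPlan(String host, int attempts) {
            return new Iterator<String>() {
                @Override
                public boolean hasNext() {
                    return count < attempts;
                }

                @Override
                public String next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    count++;
                    return host;
                }
            };
        }
    }

    // Safer shape: each iterator owns its counter.
    static final class LocalCountPolicy {
        Iterator<String> newQueryPlan(String host, int attempts) {
            return new Iterator<String>() {
                private int count = 0;

                @Override
                public boolean hasNext() {
                    return count < attempts;
                }

                @Override
                public String next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    count++;
                    return host;
                }
            };
        }
    }

    // Helper for the demo: counts how many elements an iterator yields.
    static int size(Iterator<?> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        SharedCountPolicy shared = new SharedCountPolicy();
        System.out.println(size(shared.newQueryPlan("h", 2))); // prints 2
        System.out.println(size(shared.newQueryPlan("h", 2))); // prints 0: the second plan is already exhausted

        LocalCountPolicy local = new LocalCountPolicy();
        System.out.println(size(local.newQueryPlan("h", 2))); // prints 2
        System.out.println(size(local.newQueryPlan("h", 2))); // prints 2
    }
}
```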
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922728#comment-13922728 ]

Alex Liu commented on CASSANDRA-6311:

Moved count inside the Iterator to be safe. Changed IGNORED to REMOTE. v7 is attached.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920786#comment-13920786 ]

Piotr Kołaczkowski commented on CASSANDRA-6311:

org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:275
{noformat}
Optional<SSLOptions> ssLOptions = getSSLOptions(conf);
{noformat}
typo: ssL -> ssl

--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:398:
{noformat}
Optional<Integer> maxSimultaneousRequests = getInputMinSimultReqPerConnections(conf);
Optional<Integer> minSimultaneousRequests = getInputMaxSimultReqPerConnections(conf);
{noformat}
min and max swapped?

--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:549:
{noformat}
Optional<String> keystorePassword = getInputNativeSSLTruststorePassword(conf);
{noformat}
should be:
{noformat}
Optional<String> keystorePassword = getInputNativeSSLKeystorePassword(conf);
{noformat}

--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:524:
{noformat}
return new AbstractIterator<Host>()
{
    protected Host computeNext()
    {
        return origHost;
    }
};
{noformat}
Not sure if it was the intent to create an infinite iterator returning nulls or the same host over and over again here. According to the docs, guava iterator implementations *must* invoke endOfData() to terminate iteration. Don't we need an iterator here that returns just one item, stickHost, and lets the driver handle the rest? Also, not sure if returning nulls here is allowed at all (the driver docs aren't explicit on that). I guess it is very likely going to NPE if there is a connection problem, which might cause confusion. Probably a better solution would be to just return stickHost and let the driver attempt connecting and throw a meaningful error message upon failure.

BTW, in the implementation of the LoadBalancingPolicy, having two fields origHost and stickHost is redundant, and using null on one of those to mark that the host is down / unreachable does not convey the intent clearly to me. Can't we just use stickHost and a direct boolean flag for denoting whether it is reachable or not?

--
org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java:591:
{noformat}
private static Optional<String> getStringSetting(String parameter, Configuration conf)
{
    String setting = conf.get(parameter);
    if (setting == null || setting.isEmpty())
        return Optional.absent();
    return Optional.of(setting);
}
{noformat}
In getStringSetting, an empty string is treated as an absent option, so it is not possible to have an empty string setting (not sure if it would be useful; just double-checking whether this was on purpose or by omission).

--
{noformat}
 * 2) where clause must include token(partition_key1 ... partition_keyn) > ? and
 *    token(partition_key1 ... partition_keyn) <= ?
{noformat}
It would be nice to have at least some basic validation of the WHERE clause, so the user gets a nice error message when they screw it up.

--
org/apache/cassandra/hadoop/cql3/CqlRecordReader.java:230
{noformat}
public RowIterator(Configuration conf)
{noformat}
conf not used

--
org/apache/cassandra/hadoop/cql3/CqlRecordReader.java:268
{noformat}
return Pair.create(Long.valueOf(keyId), row);
{noformat}
Boxing is not needed here.
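To make the getStringSetting question concrete, here is a small sketch of the two possible behaviours. It is an illustration under assumptions: java.util.Optional replaces the Guava Optional used in the reviewed code, a plain Map stands in for Hadoop's Configuration, and the method name getStringSettingKeepEmpty and the setting keys are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class SettingsSketch {
    // Behaviour described in the review: null and "" both collapse to "absent".
    static Optional<String> getStringSetting(String parameter, Map<String, String> conf) {
        String setting = conf.get(parameter);
        if (setting == null || setting.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(setting);
    }

    // Alternative: only a missing key counts as absent, so "" survives as a value.
    static Optional<String> getStringSettingKeepEmpty(String parameter, Map<String, String> conf) {
        return Optional.ofNullable(conf.get(parameter));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.password", ""); // hypothetical key with an explicitly empty value

        System.out.println(getStringSetting("example.password", conf).isPresent());          // prints false
        System.out.println(getStringSettingKeepEmpty("example.password", conf).isPresent()); // prints true
        System.out.println(getStringSettingKeepEmpty("example.missing", conf).isPresent());  // prints false
    }
}
```

Which variant is right depends on whether an explicitly empty value (for example an empty password) should ever be meaningful; the thread leaves that open.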
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921141#comment-13921141 ] Alex Liu commented on CASSANDRA-6311: -
1. Validating the input CQL query requires parsing the query, which is what we are trying to avoid.
2. The AbstractIterator always returns the local host (so that the task only reads data from the local host); it doesn't call endOfData(). It uses stickHost, a host name, to look up the Host object, which can't be created directly because the class is not public. The Host object, origHost, is obtained from the cluster's internal code. origHost can be null, in which case stickHost is not in the cluster. In that case we don't want the job to run, because it's on the wrong host.
3. I cleaned up the code according to the other notes. Attached the v6 version.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847419#comment-13847419 ] Jeremy Hanna commented on CASSANDRA-6311: - [~alexliu68] fwiw, the just-released version 1.0.5 of the java driver includes support for LOCAL_ONE, in case that's still helpful here. https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/UvTLT5q-5o4
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847714#comment-13847714 ] Alex Liu commented on CASSANDRA-6311: - I am updating the patch.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843370#comment-13843370 ] Alex Liu commented on CASSANDRA-6311: - @devP it's applied to the cassandra-2.0 branch.
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832811#comment-13832811 ] Jonathan Ellis commented on CASSANDRA-6311: ---
Summary of discussion in chat:
The user is responsible for providing a valid CQL statement, including token bind variables.
The IF (InputFormat) API needs to change, probably to {{<Long, Row>}} where Row is a Java Driver Row (http://www.datastax.com/drivers/java/2.0/apidocs/com/datastax/driver/core/Row.html) and Long is a per-Task row ID. (Precedent: DBInputFormat also uses a Long ID -- https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/db/DBInputFormat.html.)
We can either use the metadata from the java driver to continue to estimate progress based on partitions, or switch to estimating progress by CQL row count if there is a way to get that from the server easily.
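To make the proposed key/value shape concrete, here is a minimal sketch of pairing each row with a per-task Long ID. It is not the patch's implementation: `RowIdIterator` is a hypothetical name, a generic type `R` stands in for the driver's `Row`, and starting the counter at 0 is an assumption, since the discussion does not say how IDs are assigned.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical wrapper that turns a stream of rows into (Long id, row) pairs. */
public class RowIdIterator<R> implements Iterator<Map.Entry<Long, R>> {
    private final Iterator<R> rows;
    private long nextId = 0; // assumption: IDs start at 0 and are local to one task

    public RowIdIterator(Iterator<R> rows) {
        this.rows = rows;
    }

    @Override
    public boolean hasNext() {
        return rows.hasNext();
    }

    @Override
    public Map.Entry<Long, R> next() {
        R row = rows.next(); // throws NoSuchElementException when the rows run out
        // The primitive long is auto-boxed to Long when the entry is built.
        return new SimpleImmutableEntry<Long, R>(nextId++, row);
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<Long, String>> it =
            new RowIdIterator<String>(Arrays.asList("row-a", "row-b").iterator());
        while (it.hasNext()) {
            Map.Entry<Long, String> e = it.next();
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because the ID is only a counter within one task, it identifies a row's position in that task's output and carries no meaning across tasks.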
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832835#comment-13832835 ] Alex Liu commented on CASSANDRA-6311: -
The user-defined CQL input is expected to satisfy the following:
{code}
1) select clause must include partition key columns (to calculate the progress based on the actual CF rows processed)
2) where clause must include token(partition_key1 ... partition_keyn) > ? and token(partition_key1 ... partition_keyn) <= ? (in the right order)
{code}
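The second requirement can be checked loosely without a CQL parser. The sketch below is one possible approach and not part of the patch: `TokenRangeCheck` and `hasTokenRange` are hypothetical names, and it assumes the intended operators are `>` followed by `<=`, the usual token-range form. It is a textual check only, so string literals, comments, or quoted identifiers could fool it, and it does not verify the first requirement about the select clause.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, regex-based sanity check for the token range in a user-supplied query. */
public class TokenRangeCheck {
    // Matches: token(<cols>) > ? and token(<cols>) <= ?
    // Case-insensitive, with flexible whitespace around the operators.
    private static final Pattern TOKEN_RANGE = Pattern.compile(
        "token\\s*\\(([^)]*)\\)\\s*>\\s*\\?\\s+and\\s+token\\s*\\(([^)]*)\\)\\s*<=\\s*\\?",
        Pattern.CASE_INSENSITIVE);

    /** Returns true if the query contains the two token bounds, in order, over the same column list. */
    public static boolean hasTokenRange(String cql) {
        Matcher m = TOKEN_RANGE.matcher(cql);
        // Both token() calls should list the same columns; the comparison is purely textual.
        return m.find() && stripSpaces(m.group(1)).equals(stripSpaces(m.group(2)));
    }

    private static String stripSpaces(String cols) {
        return cols.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String ok = "SELECT id, name FROM ks.users WHERE token(id) > ? AND token(id) <= ?";
        String bad = "SELECT id, name FROM ks.users WHERE id = ?";
        System.out.println(hasTokenRange(ok));
        System.out.println(hasTokenRange(bad));
    }
}
```

A check like this could back a clearer error message at job setup, at the cost of rejecting some valid but unusually written queries.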
[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination
[ https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832843#comment-13832843 ] Jonathan Ellis commented on CASSANDRA-6311: ---
# Only if we can't estimate row count in CQL rows
# Correct