[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503192#comment-15503192
 ] 

Sylvain Lebresne commented on CASSANDRA-12367:
--

Updated patch looks good, but we should have some basic tests for this before 
committing.

bq. I don't feel strongly either way since I also agree that both options have 
merit. I've left the check in for now but I have no objection to removing it if 
others feel strongly.

I don't feel strongly either. OK to leave it as is for now.


> Add an API to request the size of a CQL partition
> -------------------------------------------------
>
>                 Key: CASSANDRA-12367
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12367
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Geoffrey Yu
>            Assignee: Geoffrey Yu
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 12367-trunk-v2.txt, 12367-trunk.txt
>
>
> It would be useful to have an API that we could use to get the total 
> serialized size of a CQL partition, scoped by keyspace and table, on disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-16 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497594#comment-15497594
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

I think we should return something like -1, not 0, if the key is not 
replicated to the box. The reason is that 0 should mean the key is not present 
on that instance, while -1 tells you that you are not calling the correct 
instances. 
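
The distinction between the two return values can be illustrated with a minimal sketch. It is not taken from the patch: the class name, the {{NOT_A_REPLICA}} constant, the ownership predicate, and the map of local sizes are all hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: none of these names come from the CASSANDRA-12367 patch. */
final class ReplicaAwareSizeSketch {
    /** Hypothetical sentinel meaning "this node does not replicate the key". */
    static final long NOT_A_REPLICA = -1L;

    /**
     * Returns -1 when the node is not a replica for the key, 0 when it is a
     * replica but holds no data for the key, and the stored size otherwise.
     */
    static long partitionSize(Predicate<String> isReplicaFor,
                              Map<String, Long> localSizes,
                              String key) {
        if (!isReplicaFor.test(key)) {
            return NOT_A_REPLICA; // the caller asked the wrong instance
        }
        return localSizes.getOrDefault(key, 0L); // a replica, possibly with no data
    }
}
```

With this shape a caller can tell "wrong node" apart from "right node, empty partition", which is the point of the suggestion; whether a negative sentinel or an exception is the better signal is a separate design choice.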



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-15 Thread Geoffrey Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495387#comment-15495387
 ] 

Geoffrey Yu commented on CASSANDRA-12367:
-

Thanks for the first pass [~slebresne]! I added another commit to address your 
comments 
[here|https://github.com/geoffxy/cassandra/commit/a71968ebba8b67591b88cafd2daf3b37e17fec52].
 I added {{rowCount()}} to the {{Partition}} interface to be able to pass in a 
{{rowEstimate}} to {{UnfilteredRowIteratorSerializer.serializedSize()}} since 
all the implementing classes already had that method available. Please let me 
know how it looks now!

{quote}
Wonders if it wouldn't be more user friendly to return 0 if the key is not 
hosted on that replica (which will simply happen if we don't check anything). 
Genuine question though, I could see both options having advantages, so 
mentioning it for the sake of discussion.
{quote}

I don't feel strongly either way since I also agree that both options have 
merit. I've left the check in for now but I have no objection to removing it if 
others feel strongly.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-13 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488541#comment-15488541
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

This JIRA was created for getting the size on disk of a CQL partition. You 
might want to create a separate JIRA for the SIZE ON feature.  



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-13 Thread Russell Bradberry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488516#comment-15488516
 ] 

Russell Bradberry commented on CASSANDRA-12367:
---

{quote}
Also by SIZE ON, will it return the size of data the query is returning or size 
on disk?
{quote}

It would probably make the most sense as the size of the data returned by the 
query. Size on disk could mean many things (e.g. compression).



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-13 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488499#comment-15488499
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

The reason you need to make this call before a write is that you don't want to 
make the partition too big. 


"In this case it would be the size of the query, if you want the size of a 
given partition then you would run a query specifying only the partition key."

If we run a query specifying only the partition key, it will read gigs of data 
and will probably time out, so that won't work. We want a cheap way to know the 
size of a CQL partition. 


Also by SIZE ON, will it return the size of data the query is returning or size 
on disk? 




[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-13 Thread Russell Bradberry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15487263#comment-15487263
 ] 

Russell Bradberry commented on CASSANDRA-12367:
---

{quote}
I am not sure how it will work like tracing with SIZE ON? When you issue a 
query after SIZE ON, will it give the size of the query or CQL partition? 
{quote}
In this case it would be the size of the query, if you want the size of a given 
partition then you would run a query specifying only the partition key.

{quote}
Also we will need the size before every read or write. This will cause calling 
SIZE ON and then OFF after every operation.
{quote}

Why?  I was suggesting this for the CQL specific representation, the internal 
representation could still be a JMX call.  If the client needs it for every 
read/write then it would just always be on, just as if you wanted to have the 
trace information for every read/write.  



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486892#comment-15486892
 ] 

Sylvain Lebresne commented on CASSANDRA-12367:
--

bq. Are these changes similar to what you had in mind?

Yes, that's what I had in mind, thanks. A few remarks from eye-balling it:
* You can get the uncompressed length of a {{SSTableReader}} with the 
{{uncompressedLength()}} method, no need to open a scanner.
* You can get the sstables for a given table using 
{{ColumnFamilyStore#getLiveSSTables()}} (or 
{{ColumnFamilyStore#getSSTables(SSTableSet.CANONICAL)}} if you really want the 
canonical set, though that probably doesn't matter much here) rather than 
iterating over all sstables of the keyspace.
* Would be more consistent to reuse {{StorageService#getValidColumnFamilies()}} 
rather than re-inventing your own checking (namely 
{{validateKeyspaceTableCombination}}).
* Regarding the memtable, it makes sense to have the option to include it, but 
I think we should be consistent in what we sum. For sstables, what we use is 
the serialized size of the partition, so I think we should do the same for 
memtables, that is call 
{{UnfilteredRowIteratorSerializer.serializedSize(partition.unfilteredIterator())}}.
* Wonders if it wouldn't be more user friendly to return 0 if the key is not 
hosted on that replica (which will simply happen if we don't check anything). 
Genuine question though, I could see both options having advantages, so 
mentioning it for the sake of discussion.
* I'd maybe call the JMX call {{getSerializedPartitionSize}} (or even 
{{getSerializedPartitionSizeInBytes}}) so it's a bit more explicit.
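
The memtable and naming remarks above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{PartitionSource}} is a hypothetical stand-in for "something that can report the serialized size of a partition" (an sstable or a memtable), keys are plain strings, and the method merely borrows the name proposed in the last bullet. It is not the patch's code and does not use Cassandra's classes.

```java
import java.util.List;

/** Illustrative only: a simplified model, not code from the patch. */
final class SerializedPartitionSizeSketch {
    /** Hypothetical stand-in for one sstable or one memtable. */
    interface PartitionSource {
        /** Serialized size of the partition in bytes, or 0 if the key is absent. */
        long serializedSizeOf(String partitionKey);
    }

    /**
     * Sums the same quantity (serialized size) over every source, so that the
     * sstable and memtable contributions are comparable.
     */
    static long getSerializedPartitionSizeInBytes(List<PartitionSource> sstables,
                                                  PartitionSource memtable,
                                                  String partitionKey,
                                                  boolean includeMemtable) {
        long total = 0L;
        for (PartitionSource sstable : sstables) {
            total += sstable.serializedSizeOf(partitionKey);
        }
        if (includeMemtable) {
            total += memtable.serializedSizeOf(partitionKey);
        }
        return total;
    }
}
```

The design point is that every term in the sum measures the same thing; mixing an on-disk serialized size with an in-memory data size would make the total hard to interpret.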




[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-12 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15485830#comment-15485830
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

I am not sure how it will work like tracing with SIZE ON? When you issue a 
query after SIZE ON, will it give the size of the query or CQL partition? 
Also we will need the size before every read or write. This will cause calling 
SIZE ON and then OFF after every operation.  



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-12 Thread Russell Bradberry (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484582#comment-15484582
 ] 

Russell Bradberry commented on CASSANDRA-12367:
---

I agree with [~thobbs] that it doesn't really belong in CQL directly.  The 
writetime and ttl meta information in CQL is at the column level and makes 
sense.  What about exposing it in the same way that TRACING is exposed, where 
setting something like "SIZES ON" modifies the output and can be implemented 
in the clients in a similar fashion?



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-09-03 Thread Geoffrey Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15461776#comment-15461776
 ] 

Geoffrey Yu commented on CASSANDRA-12367:
-

[~slebresne]: Are [these 
changes|https://github.com/geoffxy/cassandra/compare/trunk...geoffxy:CASSANDRA-12367?w=1]
 similar to what you had in mind? The patch subtracts the offset of the 
{{RowIndexEntry}} for the partition key from the offset of the entry for the 
next partition key in the file, to get a size in bytes.

I also kept the code that reads the partition from the memtable so that it 
would be possible for the operator to get information on the partition's 
footprint in the memtable as well. Note, however, that it ignores 
{{Unfiltered}} objects that are not {{Row}}s.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-30 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449767#comment-15449767
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

Let's implement this in JMX and create another JIRA to do it with virtual 
tables then. I still think it is similar to writetime, as it also returns 
internal state of the DB, even if this is not the CQL path. 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-30 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449755#comment-15449755
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

OK, let's do JMX for this. 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-30 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449351#comment-15449351
 ] 

Sylvain Lebresne commented on CASSANDRA-12367:
--

bq. We need this feature now so we can do it with CQL and then when virtual 
tables are implemented, move this feature there. What do you think Sylvain 
Lebresne

With all due respect, that's not really an argument. I'm holding my position 
that the proposed CQL mechanism is imo weird, unintuitive and ad hoc (from a 
CQL standpoint) and I really think we shouldn't do it that way. I get that 
"you" want it, but we have to think of the good of the software in general, and 
things are much easier to add than to remove, so adding something "ugly" now to 
change it later doesn't really work. I'd be ok with focusing on JMX only for 
this ticket and creating a new one to do it well in CQL, which again probably 
means using this as the initial motivation for introducing the virtual table 
mechanism we've been talking about, and I'm even happy to help with that as I 
think 1) this would be more generally useful anyway and 2) I'm not sure it's 
*that* hard in practice.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-29 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446980#comment-15446980
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

We need this feature now, so we can do it with CQL and then, when virtual 
tables are implemented, move this feature there. What do you think, 
[~slebresne]?
JMX is not an option since clients need a parallel effort to do connection 
pooling, etc. to use it. Also, JMX does not perform well, as we have seen in 
perf testing of high-volume calls. 

 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-29 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446458#comment-15446458
 ] 

Jon Haddad commented on CASSANDRA-12367:


If you're going to include it as a CQL option, I'd like to suggest making it a 
function size() rather than a special keyword.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445841#comment-15445841
 ] 

Sylvain Lebresne commented on CASSANDRA-12367:
--

I'm not entirely convinced by the way this is implemented because:
# it iterates over every row, which sounds pretty wasteful, especially if the 
goal is to have a cheap way to determine how big a partition is on disk (though 
the description of the ticket could use a bit more in terms of motivation, so 
I'm mainly guessing that's the intended use case).
# it uses {{Row#dataSize()}}, which only returns the size of the data contained 
in the row and ignores all the artifacts of serialization. It also ignores 
range tombstones. Overall, this means the returned number doesn't really 
represent the size on disk, and what it does represent is a bit ad hoc 
currently imo.

What I'd suggest instead is to use the index file and return the actual size 
of the data on disk (by simply subtracting the offset of the start of the 
partition from the offset of its end in the sstable). This would be *a lot* 
faster and imo more meaningful (the only caveat being that it's still not 
exactly the size on disk since it ignores compression, but that's probably kind 
of ok).

Regarding exposing that in CQL however, I'm pretty much -1 on the syntax 
suggested. I agree with Tyler, this is way too weird to make such a special 
case in CQL. This is very different from the {{ttl()}} and {{writetime()}} 
functions for instance, in that those just return data that is part of CQL. 
This metric implies a completely different path (since it's intrinsically a 
local query) and result set, which means it'd be almost cleaner to have a 
completely different statement, like {{GET_PARTITION_SIZE FROM foo WHERE ...}}, 
instead of reusing {{SELECT}}. I'm *not* suggesting we add that either, since 
imo it's way too ad hoc to justify the addition.

Don't get me wrong, I think this could be exposed much more elegantly once we 
have virtual tables and I'll be happy to do so when we have them. And yes, 
virtual tables will probably take a bit more time to come, but we'll have the 
JMX call in the meantime.
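
The index-based suggestion above can be sketched in a few lines. This is a simplified model, not Cassandra code: the index is represented as a sorted map from partition key to the partition's starting offset in the uncompressed data file, whereas real sstables order partitions by token and use their own index structures. The class name, method name, and {{dataLength}} parameter are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: models an sstable index as sorted key -> data-file offset. */
final class PartitionOffsetSketch {
    /**
     * Returns the uncompressed byte span of the partition for `key`, or 0 if
     * the key is absent. `dataLength` stands in for the uncompressed length of
     * the data file and bounds the last partition.
     */
    static long partitionSize(TreeMap<String, Long> index, String key, long dataLength) {
        Long start = index.get(key);
        if (start == null) {
            return 0L; // key not present in this file
        }
        // The partition ends where the next one begins, or at end of file.
        Map.Entry<String, Long> next = index.higherEntry(key);
        long end = (next == null) ? dataLength : next.getValue();
        return end - start;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> index = new TreeMap<>();
        index.put("a", 0L);
        index.put("b", 120L);
        index.put("c", 500L);
        System.out.println(partitionSize(index, "b", 640L)); // prints 380
    }
}
```

Two lookups replace a scan over every row, which is why this is expected to be much cheaper; as noted above, the result reflects the uncompressed layout rather than the compressed bytes on disk.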




[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445156#comment-15445156
 ] 

Marcus Eriksson commented on CASSANDRA-12367:
-

[~geoffxy] I *think* we could do something like this:
{code}
// Restrict the scan to keys that share this partition key's token.
DataRange keyRange = DataRange.forKeyRange(
    new Range<>(key.getToken().minKeyBound(), key.getToken().maxKeyBound()));
sstable.getScanner(ColumnFilter.all(store.metadata), keyRange, false);
{code}



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445093#comment-15445093
 ] 

Marcus Eriksson commented on CASSANDRA-12367:
-

We already expose some metadata using CQL (writetime(..), ttl(..)) so it 
wouldn't be a total special case, even though the syntax looks a bit weird (but 
I can't think of a better one)



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-25 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438264#comment-15438264
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

As per discussion with [~thobbs], we can expose this through some sort of 
virtual tables. But I think that will be a longer project. Can we do it here 
and move this call later, once we have that feature? 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-25 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437526#comment-15437526
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

I agree JMX will be simpler; however, it will be too much effort for the client 
teams to do this via JMX. Different teams would need to implement this in their 
stack, and it would be hard to use. They would also need to set up timeouts and 
connection pooling for these calls, which the Java driver already handles. 
For these reasons, and to avoid creating a parallel process to get this 
information, I think we should do it over CQL.  



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-25 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437495#comment-15437495
 ] 

Tyler Hobbs commented on CASSANDRA-12367:
-

bq. As per NGCC talk of Patrick..we are opening up CQL to query C* metrics. 

The discussion at NGCC was about exposing virtual tables that contain metrics, 
not necessarily modifying the query language to support metrics directly.

bq. Also if we expose it with JMX...how will apps make the call for which this 
is useful. They need to know which replica the key maps to and then call the 
JMX. Also we dont want to expose JMX auth to clients to call at will. So I dont 
see any other way besides CQL to expose this to clients.

The drivers have tools for determining the replicas for a partition key.  As 
for exposing JMX to clients, you could use something like mx4j or jolokia in 
front of Cassandra instead to present a limited interface.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-25 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437481#comment-15437481
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

As per NGCC talk of Patrick..we are opening up CQL to query C* metrics. 

Also if we expose it with JMX...how will apps make the call for which this is 
useful. They need to know which replica the key maps to and then call the JMX. 
Also we don't want to expose JMX auth to clients to call at will. So I don't 
see any other way besides CQL to expose this to clients. 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-25 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437450#comment-15437450
 ] 

Tyler Hobbs commented on CASSANDRA-12367:
-

Doing this as a special case in CQL feels wrong to me.  The query language is 
really designed for querying data in the database, not metadata about the 
storage layer.  I'd prefer to stick with JMX for this.



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-24 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435591#comment-15435591
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

We should actually expose a CQL call that reads this value from the replicas 
and returns all the results. Example:

Select SIZE from test where a = 10; // a is the CQL partition key
Make this query at consistency QUORUM with RF=3:

Endpoint   Size
10.0.0.1   987987
10.0.0.2   7897
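
The example above shows two rows because QUORUM with RF=3 needs answers from two replicas. A small sketch of that arithmetic and of the proposed result shape follows. The class and method names are hypothetical, and treating the sizes as bytes is an assumption; nothing here is an existing Cassandra API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape for the per-endpoint result proposed above.
final class PartitionSizeResultSketch {
    // Number of replicas that must answer at QUORUM for a replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Render endpoint -> size pairs (assumed to be bytes) as a two-column table.
    static String render(Map<String, Long> sizeByEndpoint) {
        StringBuilder out = new StringBuilder(String.format("%-10s %s%n", "Endpoint", "Size"));
        for (Map.Entry<String, Long> e : sizeByEndpoint.entrySet()) {
            out.append(String.format("%-10s %d%n", e.getKey(), e.getValue()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Insertion order is preserved so rows print in the order replicas answered.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("10.0.0.1", 987987L);
        sizes.put("10.0.0.2", 7897L);
        System.out.println("replicas needed at QUORUM, RF=3: " + quorum(3));
        System.out.print(render(sizes));
    }
}
```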

cc [~krummas] What do you think? 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-23 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433389#comment-15433389
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

Also, what if we expose this through CQL as well? There are clients who want to 
know how big the CQL partition is. So something like 
select bytes from table where 

What do you think? This call will be a lot cheaper than counting the CQL rows 
and finding out how big the partition is. 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-23 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433370#comment-15433370
 ] 

sankalp kohli commented on CASSANDRA-12367:
---

I don't think we should output more information as it will make this call 
expensive. So we should stick to size for this JMX call. We can always add more 
JMX calls for the things you are suggesting. 



[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition

2016-08-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432280#comment-15432280
 ] 

Marcus Eriksson commented on CASSANDRA-12367:
-

Could we use {{SSTableReader.getScanner(Range range, ...)}} instead of 
scanning all the partitions in the sstable? We would need to create the range 
so that it includes the token requested but I think it should save us some time 
by seeking to the correct position directly.
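
The difference between the two approaches can be sketched with a sorted map standing in for an sstable. This is only an analogy under stated assumptions: `RangeSeekSketch` and its methods are hypothetical, tokens are plain `long` values, and the range is treated as left-exclusive, right-inclusive, so `(token - 1, token]` contains exactly the requested token. It does not use the real `SSTableReader` API.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// A sorted map (token -> serialized partition size) stands in for an sstable.
final class RangeSeekSketch {
    // Baseline: visit every partition until the requested token is found.
    static long sizeByFullScan(NavigableMap<Long, Long> sstable, long token) {
        for (Map.Entry<Long, Long> e : sstable.entrySet()) {
            if (e.getKey() == token) {
                return e.getValue();
            }
        }
        return 0L;
    }

    // Range-based: build the range (token - 1, token] and seek straight to it.
    static long sizeByRange(NavigableMap<Long, Long> sstable, long token) {
        NavigableMap<Long, Long> slice = (token == Long.MIN_VALUE)
                ? sstable.headMap(token, true) // avoid overflow in token - 1
                : sstable.subMap(token - 1, false, token, true);
        long total = 0L;
        for (long size : slice.values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Long> sstable = new TreeMap<>();
        sstable.put(10L, 100L);
        sstable.put(20L, 250L);
        sstable.put(30L, 75L);
        System.out.println(sizeByFullScan(sstable, 20L));
        System.out.println(sizeByRange(sstable, 20L));
    }
}
```

Both methods return the same size; the range version locates the entry through the sorted structure instead of iterating from the start, which is the saving suggested above.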
