[ 
https://issues.apache.org/jira/browse/CASSANDRA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309077#comment-14309077
 ] 

mck edited comment on CASSANDRA-7688 at 2/6/15 12:59 PM:
---------------------------------------------------------

{quote}You are quoting the wrong code here, but how do you not background it? 
{quote}

yes i can see that it's not possible at the moment. (i didn't realise that at 
first, but it really wasn't my train of thought either).

{quote}When we add vtable support (cql tables backed by classes, not sstables) 
- then we'll switch sizing (and several other system sstables) to that.{quote}

nice to know. thanks.

{quote}This is a simple temporary replacement for describe_splits_ex, its only 
goal is to free Spark and others from having to maintain an extra Thrift 
connection now. Hence the lack of metrics or configurability of the refresh 
interval.

I'm open to increasing/decreasing the hard-coded one, however, if you have 
better options.{quote}

i have no suggestion.
i'm more concerned/curious as to why "5 minutes"?
 if there's no good answer then aren't metrics important?
 and being able to configure it?

quick examples that come to mind: 
 - what if an installation has lots of jobs built upon each other's data and for 
them there's a strong benefit (if not a requirement) for more accurate sizes 
(i.e. a faster schedule rate),
 - what if there are bugs/load caused by this that can be avoided (for an 
installation that doesn't ever use hadoop/spark) by configuring it to zero 
(disabling), giving an immediate alternative to upgrading to, or waiting for, 
the next version.
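
to make the two examples above concrete, here is a rough sketch of what a configurable interval could look like, where zero disables the task and a plain counter stands in for a metric. this is only an illustration, not the code in the attached patch: the class name, the method names and the idea of passing the interval in seconds are all assumptions made up for this example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the CASSANDRA-7688 patch.
final class SizeEstimatesScheduler {
    private final ScheduledExecutorService executor;
    private final Runnable recordTask;
    // Simple counter standing in for a real metric.
    private final AtomicLong completedRuns = new AtomicLong();
    private ScheduledFuture<?> future;

    SizeEstimatesScheduler(ScheduledExecutorService executor, Runnable recordTask) {
        this.executor = executor;
        this.recordTask = recordTask;
    }

    // (Re)schedules the task. An interval of 0 disables it; negative values are rejected.
    // Returns true if the task is scheduled after the call.
    synchronized boolean reschedule(long intervalSeconds) {
        if (intervalSeconds < 0) {
            throw new IllegalArgumentException("interval must be >= 0, got " + intervalSeconds);
        }
        if (future != null) {
            future.cancel(false); // let a run that is already in progress finish
            future = null;
        }
        if (intervalSeconds == 0) {
            return false; // disabled
        }
        future = executor.scheduleWithFixedDelay(() -> {
            recordTask.run();
            completedRuns.incrementAndGet(); // counted only when the run did not throw
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return true;
    }

    synchronized boolean isEnabled() {
        return future != null;
    }

    long completedRuns() {
        return completedRuns.get();
    }
}
```

with something like this, an installation with chained jobs could call `reschedule(60)` for fresher sizes, and one that never uses hadoop/spark could call `reschedule(0)` to switch the task off without waiting for a new release.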


was (Author: michaelsembwever):
{quote}You are quoting the wrong code here, but how do you not background it? 
{quote}

yes i can see that it's not possible at the moment. (i didn't realise that at 
first, but it really wasn't my train of thought either).

{quote}When we add vtable support (cql tables backed by classes, not sstables) 
- then we'll switch sizing (and several other system sstables) to that.{quote}

nice to know. thanks.

{quote}This is a simple temporary replacement for describe_splits_ex, its only 
goal is to free Spark and others from having to maintain an extra Thrift 
connection now. Hence the lack of metrics or configurability of the refresh 
interval.

I'm open to increasing/decreasing the hard-coded one, however, if you have 
better options.{quote}

i have no suggestion.
i'm more concerned/curious as to why "5 minutes"?
 if there's no good answer then aren't metrics important?
 and being able to configure it?

quick examples that come to mind: 
 - what if an installation has lots of jobs built upon each other's data and for 
them there's a strong benefit (if not a requirement) for more accurate sizes 
(i.e. a faster schedule rate),
 - what if there are bugs/load caused by this that can be avoided by configuring 
it to zero (disabling), giving an immediate alternative to upgrading to, or 
waiting for, the next version.

> Add data sizing to a system table
> ---------------------------------
>
>                 Key: CASSANDRA-7688
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7688
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jeremiah Jordan
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1.3
>
>         Attachments: 7688.txt
>
>
> Currently you can't implement something similar to describe_splits_ex purely 
> from a native protocol driver.  
> https://datastax-oss.atlassian.net/browse/JAVA-312 is open to expose easily 
> getting ownership information to a client in the java-driver.  But you still 
> need the data sizing part to get splits of a given size.  We should add the 
> sizing information to a system table so that native clients can get to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
