[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

Benjamin Lerer (JIRA) Tue, 24 Apr 2018 08:02:27 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450017#comment-16450017
 ]


Benjamin Lerer commented on CASSANDRA-7622:
-------------------------------------------

{quote}you can do arithmetic operations, searches and aggregations on them but 
the type was wrong{quote}
Sorry, my comment was misleading. I just wanted to mention the fact that 
aggregates and arithmetic operations do not work on {{TEXT}} values.

{quote}What would you think the table schema should look like?{quote}
I asked myself that question a lot. Given CQL's limitations, I do not think 
that there is a perfect solution. In the end, my preferred schema is:
{code}
SYSTEM VIEW sv_table_metrics (keyspace TEXT,
                          table TEXT,
                          memtable_on_heap_size BIGINT,
                          memtable_off_heap_size BIGINT,
                          [...]
                          PRIMARY KEY (keyspace, table));
{code}

In the case of the {{table}} and {{keyspace}} metrics, that approach results in 
tables with a large number of columns (even if we mitigate that by using 
user-defined types for histograms, meters and timers), but it allows you to 
easily select different subsets of the data. You can query based on 
{{keyspaces}}, {{tables}}, {{metrics}} and {{metric fields}}. At the same time, 
you can efficiently select a specific metric value for a given table.
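
To make that access pattern concrete, here is a minimal Java sketch that models the layout in memory: the partition key ({{keyspace}}) selects a partition, the clustering key ({{table}}) selects a row, and a column name selects one metric. The class name, method names and sample values are assumptions made up for illustration; this is not code from either patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of sv_table_metrics; not taken from any patch. */
public class TableMetricsLayoutSketch {
    // partition key (keyspace) -> clustering key (table) -> column name -> value
    private final Map<String, TreeMap<String, Map<String, Long>>> partitions = new HashMap<>();

    /** Stores one metric column for the row identified by (keyspace, table). */
    public void put(String keyspace, String table, String metric, long value) {
        partitions.computeIfAbsent(keyspace, k -> new TreeMap<>())
                  .computeIfAbsent(table, t -> new HashMap<>())
                  .put(metric, value);
    }

    /** All rows of one keyspace: a single-partition read. */
    public Map<String, Map<String, Long>> byKeyspace(String keyspace) {
        TreeMap<String, Map<String, Long>> rows = partitions.get(keyspace);
        return rows == null ? Collections.emptyMap() : Collections.unmodifiableMap(rows);
    }

    /** One metric of one table: partition lookup, row lookup, column projection. */
    public Long metric(String keyspace, String table, String metric) {
        Map<String, Long> row = byKeyspace(keyspace).get(table);
        return row == null ? null : row.get(metric);
    }

    public static void main(String[] args) {
        TableMetricsLayoutSketch view = new TableMetricsLayoutSketch();
        // Made-up sample values, for illustration only.
        view.put("ks1", "users", "memtable_on_heap_size", 1024L);
        view.put("ks1", "users", "memtable_off_heap_size", 2048L);
        view.put("ks1", "events", "memtable_on_heap_size", 4096L);
        System.out.println(view.metric("ks1", "users", "memtable_on_heap_size"));
        System.out.println(view.byKeyspace("ks1").keySet());
    }
}
```

Because every metric is its own column, narrowing a read to one keyspace, one table or one metric is just a lookup at the corresponding level of the nested map.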

{quote}I am not fussy about naming. However, using the same terminology does 
confuse users as they may expect the same feature set from Cassandra as they 
got in their relational database. I would personally avoid it.{quote}

Based on my experience working on CQL tickets and my interactions with users 
and discussions with evangelists, I came to 2 conclusions.
# If the feature is similar to one that they know from the relational 
world, people prefer that you use the same name. It is easier for them to 
recognize it and to understand how it should be used.
# If the feature behaves differently from its counterpart in the relational 
world, you should be careful and use a different name, or it will backfire.

In this case, there is no real difference between us and the relational world. 
Because of that, I think it would be a mistake not to reuse the name.
In my opinion, {{Virtual Table}} is the really confusing name. It just 
makes me think of some form of pluggable storage. Coming from the SQL world, it 
is not the name I would type into Google to figure out how to access system 
information in Cassandra.

{quote}do you have a design or code that you can share? It would be great if 
you can post it. Is there a timeline around when you'll post it?{quote}

At a high level, there are some similarities between [~cnlwsu]'s patch and ours. 
We have introduced some {{ReadQuery}} subclasses that delegate calls to 
{{SystemViews}} and slightly refactored the CQL layer to allow it to work on 
top of all {{ReadQuery}} implementations. The advantage of that approach is 
that the existing CQL functionality is automatically supported on top of the 
{{SystemViews}}, and far less conditional logic is required to add support for 
{{SystemViews}}.
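
As a rough illustration of that delegation idea, the following self-contained Java sketch shows a generic selection step that only depends on a read abstraction, plus one implementation that forwards to a view. The interfaces here are simplified stand-ins that merely borrow the names {{ReadQuery}} and {{SystemView}}; their methods, the {{SystemViewReadQuery}} class and the {{select}} helper are assumptions for illustration and do not reflect the real Cassandra or DSE signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of the delegation idea; names and signatures are illustrative only. */
public class ReadQueryDelegationSketch {
    /** Stand-in for the read abstraction the CQL layer depends on (not Cassandra's real ReadQuery). */
    interface ReadQuery {
        List<String[]> execute();
    }

    /** Stand-in for a system view: it only knows how to produce its rows. */
    interface SystemView {
        List<String[]> fetchRows();
    }

    /** A ReadQuery implementation that delegates to a SystemView instead of reading sstables. */
    static final class SystemViewReadQuery implements ReadQuery {
        private final SystemView view;

        SystemViewReadQuery(SystemView view) {
            this.view = view;
        }

        @Override
        public List<String[]> execute() {
            return view.fetchRows();
        }
    }

    /** Generic "CQL layer" step: filtering is written once and works on any ReadQuery. */
    static List<String[]> select(ReadQuery query, Predicate<String[]> where) {
        List<String[]> result = new ArrayList<>();
        for (String[] row : query.execute()) {
            if (where.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up rows of the form {keyspace, table}.
        SystemView view = () -> Arrays.asList(
            new String[] {"ks1", "users"},
            new String[] {"ks2", "orders"});
        List<String[]> rows = select(new SystemViewReadQuery(view), r -> r[0].equals("ks1"));
        System.out.println(rows.size() + " row(s) matched");
    }
}
```

In this sketch the view contains no filtering logic of its own; everything beyond fetching rows lives in the shared {{select}} step.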

[~cnlwsu]'s current patch does not support some range queries or multi-partition 
queries, for example. It will fire an {{[Invalid query] message="IN restrictions 
are not supported on indexed columns"}} for multi-partition queries. We avoided 
that kind of risk with our approach.

That reduced the logic of our {{SystemView}} implementation to just fetching 
the requested data or updating it.

Our current code has been designed for DSE, so I need to modify it to make it 
work on top of C*. As I am also quite busy with some other tasks, it will 
probably take 2 weeks before I finish the port.

  
    



> Implement virtual tables
> ------------------------
>
>                 Key: CASSANDRA-7622
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Tupshin Harper
>            Assignee: Chris Lohfink
>            Priority: Major
>             Fix For: 4.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
