[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088738#comment-13088738
 ] 

T Jake Luciani edited comment on CASSANDRA-2474 at 8/22/11 2:49 PM:
--------------------------------------------------------------------

Rather than trying to support all this flexibility on the fly in Hive, why 
don't we let users create transposed projections in the DML?

I think this is reasonable compared with the CQL language because CQL is the 
source of truth for the DML, whereas Hive refers to the meta info explicitly.

Example for timeseries supercol wide rows:

{code}
CREATE EXTERNAL TABLE timeline(user_id string, tweet_id long, username string, 
timestamp long)
      STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
      WITH SERDEPROPERTIES ( "cassandra.columns.mapping" = 
":key,:supercol,:subcol:uname,:subcol:ts")    
{code}

:key, :supercol, :subcol are reserved keywords
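
As a rough sketch of how such a projection might be queried, assuming the 
mapping is positional (:key -> user_id, :supercol -> tweet_id, :subcol:uname -> 
username, :subcol:ts -> timestamp) and that each supercolumn of a wide row 
surfaces as one Hive row. The user id 'alice' is made up for illustration:

{code}
-- Hypothetical query against the timeline projection defined above.
-- Assumes one Hive row per (row key, supercolumn) pair.
SELECT tweet_id, username
FROM timeline
WHERE user_id = 'alice';
{code}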

Example for composite/supercolumn level:

{code}
CREATE EXTERNAL TABLE retweets(user_id string, tweet_id long, retweet_id long)  
      STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
      WITH SERDEPROPERTIES ( "cassandra.columns.mapping" = 
":key,retweet:component1,retweet:component2")
{code}

This means you might have to create many projections on the source data in Hive 
to support your analysis, but this feels more OLAP than the CQL 
transpose-on-the-fly approach.
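
For instance, an analysis spanning both projections above might look like the 
following. This is only a sketch: it assumes user_id and tweet_id in retweets 
identify the same tweets as in timeline, which the mapping above does not 
spell out.

{code}
-- Hypothetical: count retweets per user by joining the two projections.
SELECT t.username, COUNT(r.retweet_id) AS retweets
FROM timeline t
JOIN retweets r
  ON (t.user_id = r.user_id AND t.tweet_id = r.tweet_id)
GROUP BY t.username;
{code}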

Also, we could still support the transpose() function syntax using a 
passthrough UDF so the same syntax works across CQL and HiveQL.


      was (Author: tjake):
    Rather than trying to support all this flexibility on the fly in Hive, why 
don't we let users create transposed projections in the DML?

I think this is reasonable compared with the CQL language because CQL is the 
source of truth for the DML, whereas Hive refers to the meta info explicitly.

Example for timeseries supercol wide rows:

{code}
CREATE EXTERNAL TABLE timeline(user_id string, tweet_id long, username string, 
timestamp long)
      STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
      WITH SERDEPROPERTIES ( "cassandra.columns.mapping" = 
":key,:supercol,:subcol:uname,:subcol:ts")    
{code}

:key, :supercol, :subcol are reserved keywords

Example for composite/supercolumn level:

{code}
CREATE EXTERNAL TABLE retweets(user_id string, tweet_id long, retweet_id long)  
      STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
      WITH SERDEPROPERTIES ( "cassandra.columns.mapping" = 
":key,:component1,:component2")
{code}

This means you might have to create many projections on the source data in Hive 
to support your analysis, but this feels more OLAP than the CQL 
transpose-on-the-fly approach.

Also, we could still support the transpose() function syntax using a 
passthrough UDF so the same syntax works across CQL and HiveQL.

  
> CQL support for compound columns
> --------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: API, Core
>            Reporter: Eric Evans
>              Labels: cql
>             Fix For: 1.0
>
>
> For the most part, this boils down to supporting the specification of 
> compound column names (the CQL syntax is colon-delimited terms), and then 
> teaching the decoders (drivers) to create structures from the results.


        
