Thanks Jack. Let me rephrase and try to get some help :)
CQL tables always have a schema, but Thrift allowed you to have a cf with a mix of statically declared columns (part of the schema) and dynamic columns, i.e. columns not part of the schema, created as and when needed at runtime. When you drop Thrift code and use CQL to access Thrift-written data, the data in statically declared columns is readable, but if you want to read the data in dynamic columns via CQL, you need to drop all column metadata via cassandra-cli by executing the following command:
update column family user_profiles
      with key_validation_class = UTF8Type
       and comparator = UTF8Type
       and column_metadata = []

I would suggest reading the section "Mixing static and dynamic" at http://www.datastax.com/dev/blog/thrift-to-cql3 to understand the above approach.

Another approach is to create a new table in CQL with collections for the dynamic data, plus the same statically declared columns as in the old cf. Here are the challenges:
1. Performance of Spark / range-scan queries on the new non-compact-storage table is worse than on the Thrift-created compact-storage table, due to more IO with non-compact storage. This is addressed in 3.x, but we are not comfortable moving to 3.x in production.

2. Data migration from the Thrift table to the new table takes hours. This is problematic in production systems.
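For reference, a minimal sketch of what the second approach's table might look like (all names here are hypothetical, not from the original thread; a map collection holds the formerly dynamic columns):

```sql
-- Hypothetical approach-2 schema: static columns plus a map for dynamic data.
CREATE TABLE user_profiles_v2 (
    key           text PRIMARY KEY,
    first_name    text,             -- example statically declared column
    last_name     text,             -- example statically declared column
    dynamic_data  map<text, text>   -- dynamic columns folded into a map
);

-- A dynamic "column" becomes a map entry:
UPDATE user_profiles_v2
   SET dynamic_data['some_dynamic_column'] = 'some_value'
 WHERE key = 'some_row_key';
```

Note that a collection is read back in its entirety rather than entry by entry, so a map is not a drop-in replacement for a truly wide row; that difference is part of why the storage and scan characteristics change versus the compact-storage original.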

Thanks,
Anuj
Sent from Yahoo Mail on Android 
 
On Mon, 11 Apr, 2016 at 8:59 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

Sorry, but your message is too confusing - you say "reading dynamic columns in CQL" and "make the table schema less", but neither has any relevance to CQL!
1. CQL tables always have schemas.
2. All columns in CQL are statically declared (even maps/collections are statically declared columns).
Granted, it is a challenge for Thrift users to get used to the terminology of CQL, but it is required. If necessary, review some of the free online training videos for data modeling.
Unless your data model is very simple and translates directly into CQL, you probably do need to bite the bullet and re-model your data to exploit the features of CQL, rather than fight CQL by trying to mimic Thrift per se.
In any case, take another shot at framing the problem and then maybe people 
here can help you out.
-- Jack Krupansky
On Mon, Apr 11, 2016 at 10:39 AM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Any comments or suggestions on this one? 
Thanks,
Anuj

 
On Sun, 10 Apr, 2016 at 11:39 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Hi
We are on 2.0.14 and Thrift. We are planning to migrate to CQL soon but are facing some challenges.
We have a cf with a mix of statically defined columns and dynamic columns (created at run time). For reading the dynamic columns in CQL, we have two options:
1. Drop all columns and make the table schemaless. This way, we will get a CQL row for each column defined for a row key, as mentioned here: http://www.datastax.com/dev/blog/thrift-to-cql3
2. Migrate the entire data to a new non-compact-storage table and use collections for the dynamic columns in the new table.
In our case, we have observed that approach 2 gives 3x slower performance in the range-scan queries used by Spark. This is not acceptable. Cassandra 3 has an optimized storage engine, but we are not comfortable moving to 3.x in production.
Moreover, data migration to the new table using Spark takes hours.

Any suggestions for the two issues?

Thanks,
Anuj



  
