Re: Alter table

2018-07-30 Thread Jeff Jirsa
This is safe (and normal, and good) in all versions except those impacted
by https://issues.apache.org/jira/browse/CASSANDRA-13004

So if you're on 2.1, 2.2, or 3.11, you're fine.

If you're on 3.0 between 3.0.0 and 3.0.13, you should upgrade first (to the
newest 3.0 release, probably 3.0.17).
If you're on a version between 3.1 and 3.10, you should upgrade first (to the
newest 3.11 release, probably 3.11.3).
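On a fixed version, the additive change itself is just a plain ALTER (keyspace,
table, and column names here are placeholders):

```cql
-- Adding a column is a metadata-only change; existing rows simply
-- read the new column as null until something writes it.
ALTER TABLE my_keyspace.my_table ADD new_col text;
```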

- Jeff


On Mon, Jul 30, 2018 at 10:16 PM, Visa  wrote:

> Hi all,
>
> I have a question about altering a schema. If we only add columns, is it
> OK to alter the schema while writes to the table are happening at the same
> time? We can ensure that writes will not touch the new columns until the
> schema change is done. Or is it better to stop writes to that table first?
>
> Thanks!
>
> Li
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Alter table

2018-07-30 Thread Visa
Hi all,

I have a question about altering a schema. If we only add columns, is it OK to
alter the schema while writes to the table are happening at the same time? We
can ensure that writes will not touch the new columns until the schema change
is done. Or is it better to stop writes to that table first?

Thanks!

Li



RE: [EXTERNAL] Server kernel parameters for Cassandra

2018-07-30 Thread Durity, Sean R
Here are some to review and test for Cassandra 3.x from DataStax:
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configRecommendedSettings.html

Al Tobey has done extensive work in this area, too. This is dated (Cassandra 
2.1), but is worth mining for information:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
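For reference, a sketch of the kind of kernel settings those guides discuss.
The values below are common starting points from those documents, not
authoritative recommendations; test against your own workload and check the
current docs before adopting any of them:

```shell
# Allow the large number of memory-mapped SSTable segments Cassandra uses
sysctl -w vm.max_map_count=1048575
# Avoid swapping the JVM heap (disabling swap entirely is also common)
sysctl -w vm.swappiness=1
# Detect dead TCP peers between nodes faster
sysctl -w net.ipv4.tcp_keepalive_time=60
sysctl -w net.ipv4.tcp_keepalive_probes=3
sysctl -w net.ipv4.tcp_keepalive_intvl=10
# Larger socket buffers for streaming-heavy workloads
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
```

To persist them across reboots, the same keys go in /etc/sysctl.conf.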


Sean Durity

-Original Message-
From: rajasekhar kommineni 
Sent: Sunday, July 29, 2018 7:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Server kernel parameters for Cassandra

Hello,

Are there any standard values for server kernel parameters for running
Cassandra? Please share some insight.

Thanks,










Re: Infinite loop of single SSTable compactions

2018-07-30 Thread Martin Mačura
Hi Rahul,

The table TTL is 24 months. The oldest data is 22 months old, so no
expirations yet. Compacted partition maximum bytes: 17 GB - yeah, I
know that's not good, but we'll have to wait for the TTL to make it go
away. More recent partitions are kept under 100 MB by bucketing.

The data model:
CREATE TABLE keyspace.table (
   group int,
   status int,
   bucket timestamp,
   ts timeuuid,
   source int,
...
   PRIMARY KEY ((group, status, bucket), ts, source)
) WITH CLUSTERING ORDER BY (ts DESC, source ASC)

There are no INSERT statements with the same 'ts' and 'source'
clustering columns.

Regards,

Martin
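The diagnostic steps suggested in the quoted reply below can be sketched as a
quick checklist (keyspace, table, and SSTable file names are placeholders):

```shell
# Which table is currently compacting, and how much work is pending?
nodetool compactionstats

# Partition sizes and tombstone estimates for the affected table
nodetool cfstats keyspace_name.table_name

# Per-SSTable metadata: droppable tombstone estimate, min/max timestamps
sstablemetadata /var/lib/cassandra/data/keyspace_name/table_name-*/mc-1234-big-Data.db
```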
On Thu, Jul 26, 2018 at 12:16 PM Rahul Singh
 wrote:
>
> Few questions
>
>
> What is your maximum compacted partition size across the cluster for this table?
> What’s your TTL ?
> What does your data model look like as in what’s your PK?
>
> Rahul
> On Jul 25, 2018, 1:07 PM -0400, James Shaw , wrote:
>
> nodetool compactionstats  --- see which table is compacting
> nodetool cfstats keyspace_name.table_name  --- check partition size and
> tombstones
>
> Go to the data file directories and look at the data file sizes and
> timestamps --- compaction will write to a new temp file with _tmplink...
>
> Use sstablemetadata ...    and look at the largest or oldest files first.
>
> Of course, other factors may be involved, like disk space, etc.
> Also check compaction_throughput_mb_per_sec in cassandra.yaml.
>
> Hope it is helpful.
>
> Thanks,
>
> James
>
>
>
>
> On Wed, Jul 25, 2018 at 4:18 AM, Martin Mačura  wrote:
>>
>> Hi,
>> we have a table which is being compacted all the time, with no change in 
>> size:
>>
>> Compaction History:
>> compacted_at             bytes_in     bytes_out    rows_merged
>> 2018-07-25T05:26:48.101  57248063878  57248063878  {1:11655}
>> 2018-07-25T01:09:47.346  57248063878  57248063878  {1:11655}
>> 2018-07-24T20:52:48.652  57248063878  57248063878  {1:11655}
>> 2018-07-24T16:36:01.828  57248063878  57248063878  {1:11655}
>> 2018-07-24T12:11:00.026  57248063878  57248063878  {1:11655}
>> 2018-07-24T07:28:04.686  57248063878  57248063878  {1:11655}
>> 2018-07-24T02:47:15.290  57248063878  57248063878  {1:11655}
>> 2018-07-23T22:06:17.410  57248137921  57248063878  {1:11655}
>>
>> We tried setting unchecked_tombstone_compaction to false, but it had no effect.
>>
>> The data is a time series; there will be only a handful of cell
>> tombstones present. The table has a TTL, but it'll be at least a month
>> before it takes effect.
>>
>> Table properties:
>>AND compaction = {'class':
>> 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>> 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
>> 'max_threshold': '32', 'min_threshold': '4',
>> 'unchecked_tombstone_compaction': 'false'}
>>AND compression = {'chunk_length_in_kb': '64', 'class':
>> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>AND crc_check_chance = 1.0
>>AND dclocal_read_repair_chance = 0.0
>>AND default_time_to_live = 63072000
>>AND gc_grace_seconds = 10800
>>AND max_index_interval = 2048
>>AND memtable_flush_period_in_ms = 0
>>AND min_index_interval = 128
>>AND read_repair_chance = 0.0
>>AND speculative_retry = 'NONE';
>>
>> Thanks for any help
>>
>>
>> Martin
>>
>>
>




Re: Re: Data model storage optimization

2018-07-30 Thread James Shaw
Things to consider:
- row size: large or not
- update-heavy or not (an update is actually an insert)
- read-heavy or not
- overall read performance

If rows are large, you may consider a user_detail table and add an id column
to all tables. On the application side, merge/join by id. But you pay a read
price: a second query against user_detail.

Just my 2 cents. Hope it's helpful.
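A rough sketch of that layout (keyspace, table, column names, and types are
all hypothetical):

```cql
-- Heavy, shared columns stored once per user, keyed by id
CREATE TABLE ks.user_detail (
    id uuid PRIMARY KEY,
    name text,
    address text
    -- plus the other wide, duplicated columns
);

-- Lookup tables carry only their key plus the id
CREATE TABLE ks.user_by_email (
    email text PRIMARY KEY,
    id uuid
);

CREATE TABLE ks.user_by_username (
    username text PRIMARY KEY,
    id uuid
);
```

Reads by email or username then cost a second query against user_detail,
which is the read price mentioned above.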

Thanks,

James


On Sun, Jul 29, 2018 at 11:20 PM, onmstester onmstester  wrote:

>
> How many rows on average per partition?
>
> Around 10K.
>
>
> Let me get this straight: you are bifurcating your partitions on either
> email or username, essentially potentially doubling the data, because you
> don't have a way to manage a central system of record of users?
>
> We are just analyzing the output logs of a "perfectly" running application,
> so no one will let me change its data design. I thought this might be a
> more general problem for Cassandra users, where someone both
> 1. needed to access an identical set of columns by multiple keys (all the
> keys would be present in rows), and
> 2. had a storage limit (since TTL * input rate would be some TBs).
> I know there is a strict rule in Cassandra data modeling - "never use
> foreign keys; sacrifice disk instead" - but has anyone ever been forced to
> do such a thing, and how?
>
>


restoring with commitlogs in C* 3.11

2018-07-30 Thread Jean Carlo
Hello everyone,

I am testing the procedure for restoring a table using the commitlogs,
without success.

I am following this doc (even though it is for DSE):

https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-


I am probably missing something. Is anyone using the commitlog to do
point-in-time restores?
Is there a procedure to make it work?


This is my procedure.


1- Add this line to cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Dcassandra.replayList=pns_nonreg_bench.cf1"

2- I edit the file /etc/cassandra/commitlog_archiving.properties like this

archive_command=/bin/ln %path /disk2/cassandra/commit_log_backup/%name
restore_command=cp -f %from %to
restore_directories=/disk1/cassandra/commitlog/
restore_point_in_time=
precision=MICROSECONDS

3- I restart all the nodes

4- I made a delete in Cassandra and then modified restore_point_in_time to
set it to before the delete:
restore_point_in_time=2018:07:30 09:54:00 # before the delete


5- Then I restarted Cassandra, and I see only this in the logs:

INFO  [main] 2018-07-30 10:40:43,652 CommitLog.java:157 - Replaying
/var/opt/hosting/db/cassandra/commitlog/CommitLog-6-1532697162576.log
INFO  [main] 2018-07-30 10:40:43,705 CommitLog.java:159 - Log replay
complete, 0 replayed mutations

I did not get the data back (the deleted data was not replayed).


For info, I also tried the procedure using snapshots and then replaying the
commitlogs, without success.


Thank you very much

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay