Re: Empty snapshot created
Hi Ben,

Many thanks. I had tried this command earlier too, but I guess my syntax was wrong. This time I simply issued nodetool rebuild dc1 and took the snapshot after that, and I could see the *.db files created. Earlier I had been trying nodetool rebuild -- dc1, which I guess was wrong.

Thanks for the lightning-fast response.

Regards,
Mradul

On Fri, Jun 10, 2016 at 9:52 AM, Ben Slater wrote:

> After adding a DC you need to run nodetool rebuild. See the procedure here:
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
>
> Cheers
> Ben
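A rough sketch of the sequence that worked here, run on a node in the new datacenter dc2, with dc1 holding the existing data (keyspace and paths taken from this thread):

    # on each node in the new datacenter (dc2), stream the existing data from dc1
    nodetool rebuild dc1

    # then snapshot the keyspace and check that *.db files show up alongside manifest.json
    nodetool snapshot other_map
    ls data/data/other_map/country-*/snapshots/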
Re: Empty snapshot created
After adding a DC you need to run nodetool rebuild. See the procedure here:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

Cheers
Ben

--
Ben Slater
Chief Product Officer, Instaclustr
+61 437 929 798
Empty snapshot created
Hi,

I am facing an issue when taking snapshots. The details of the setup are as follows:

1. Cassandra version 3.5.
2. I have a keyspace named other_map with 'NetworkTopologyStrategy' and replication factor 1 for 'dc1'.
3. Added another datacenter 'dc2' to the existing cluster.
4. Modified the other_map keyspace using the ALTER command (a sketch follows at the end of this message).
5. After this, logged on to a node in the dc2 datacenter and issued the nodetool snapshot command for the other_map keyspace.
6. As a result a directory is created under other_map/<table name>/snapshots/. It contains only a manifest.json file, which lists no files:

cat data/data/other_map/country-f34a28d02b1511e689afc7a4a4b2ee40/snapshots/1465457086678/manifest.json
{"files":[]}

Am I missing anything here? Are the above-mentioned steps complete?

After altering the keyspace I also tried a nodetool repair command, which did not change anything either.

Regards,
Mradul

Information about the schema follows:

CREATE KEYSPACE other_map WITH replication = {'class': 'NetworkTopologyStrategy', 'dc': '1', 'dc2': '2'} AND durable_writes = true;

CREATE TABLE other_map.country (
    id int PRIMARY KEY,
    name text,
    states int
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
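A minimal CQL sketch of step 4 above, assuming one replica is wanted in each datacenter; note that the datacenter names in the replication map must match exactly what nodetool status reports (e.g. dc1/dc2):

    ALTER KEYSPACE other_map
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 1, 'dc2': 1};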
Re: Interesting use case
The example I gave was for when N=1; if we need to save more values I planned to just add more columns.

On Thu, Jun 9, 2016 at 12:51 AM, Kurt Greaves wrote:

> I would say it's probably due to a significantly larger number of partitions when using the overwrite method - but really you should be seeing similar performance unless one of the schemas ends up generating a lot more disk IO.
> If you're planning to read the last N values for an event at the same time, the wide-row schema would be better; otherwise reading N events using the overwrite schema will result in you hitting N partitions. You really need to take into account how you're going to read the data when you design a schema, not only how many writes you can push through.
>
> On 8 June 2016 at 19:02, John Thomas wrote:
>
>> We have a use case where we are storing event data for a given system and only want to retain the last N values. Storing extra values for some time, as long as it isn't too long, is fine, but never fewer than N. We can't use TTLs to delete the data because we can't be sure how frequently events will arrive and could end up losing everything. Is there any built-in mechanism to accomplish this or a known pattern that we can follow? The events will be read and written at a pretty high frequency, so the solution would have to be performant and not fragile under stress.
>>
>> We've played with a schema that just has N distinct columns with one value in each, but have found overwrites seem to perform much worse than wide rows. The use case we tested only required storing the most recent value:
>>
>> CREATE TABLE eventvalue_overwrite (
>>     system_name text,
>>     event_name text,
>>     event_time timestamp,
>>     event_value blob,
>>     PRIMARY KEY (system_name, event_name));
>>
>> CREATE TABLE eventvalue_widerow (
>>     system_name text,
>>     event_name text,
>>     event_time timestamp,
>>     event_value blob,
>>     PRIMARY KEY ((system_name, event_name), event_time))
>> WITH CLUSTERING ORDER BY (event_time DESC);
>>
>> We tested it against the DataStax AMI on EC2 with 6 nodes, replication factor 3, write consistency 2, and default settings, with a write-only workload, and got 190K/s for the wide-row schema and 150K/s for the overwrite schema. Thinking through the write path it seems the performance should be pretty similar, with probably smaller sstables for the overwrite schema - can anyone explain the big difference?
>>
>> The wide-row solution is more complex in that it requires a separate clean-up thread to handle deleting the extra values. If that's the path we have to follow, we're thinking we'd add a bucket of some sort so that we can delete an entire partition at a time after copying some values forward, on the assumption that deleting the whole partition is much better than deleting some slice of the partition. Is that true? Also, is there any difference between setting a really short TTL and doing a delete?
>>
>> I know there are a lot of questions in there, but we've been going back and forth on this for a while and I'd really appreciate any help you could give.
>>
>> Thanks,
>> John
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
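As a small CQL illustration of the read and clean-up patterns discussed above (the system and event names are made up; LIMIT returns the newest rows because eventvalue_widerow clusters by event_time DESC):

    -- read the most recent N (here 5) values for one event from the wide-row table
    SELECT event_time, event_value
    FROM eventvalue_widerow
    WHERE system_name = 'sysA' AND event_name = 'temperature'
    LIMIT 5;

    -- clean-up: remove an entire partition in one statement,
    -- rather than deleting a slice of rows within it
    DELETE FROM eventvalue_widerow
    WHERE system_name = 'sysA' AND event_name = 'temperature';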
Re: Consistency level ONE and using withLocalDC
Hi Alain,

Thank you for your answer.

I recently queried my cluster multiple times with consistency ONE, with "myLocalDC" set as the local DC (and withUsedHostsPerRemoteDc=1). However, sometimes (not always) I got the response from the node in the remote DC, even though all nodes in "myLocalDC" were up and running.

I was facing a data inconsistency issue: when connecting to the remote node I got an empty result, while when connecting to "myLocalDC" I got the expected result back. I was expecting that, since all nodes in "myLocalDC" were up and running, no attempt would be made to contact the remote node.

I had to solve the problem by setting consistency LOCAL_ONE until I repair the remote node. Alternatively, I could have set withUsedHostsPerRemoteDc=0.

Kind regards,
George

On Wed, Jun 8, 2016 at 7:10 PM, Alain RODRIGUEZ wrote:

> Hi George,
>
>> Would that be correct?
>
> I think it is actually quite the opposite :-).
>
> It is very well explained here:
> https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.Builder.html#withUsedHostsPerRemoteDc-int-
>
> A connection is opened to the X nodes in the remote DC, but it will only be used as a fallback for an operation, and only if that operation is not using a LOCAL_* consistency level.
>
> Sorry I took so long to answer you.
>
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-05-20 17:54 GMT+02:00 George Sigletos:
>
>> Hello,
>>
>> Using withLocalDC="myLocalDC" and withUsedHostsPerRemoteDc > 0 will guarantee that you will connect to one of the nodes in "myLocalDC", but DOES NOT guarantee that your read/write request will be acknowledged by a "myLocalDC" node. It may well be acknowledged by a remote DC node as well, even if "myLocalDC" is up and running.
>>
>> Would that be correct? Thank you
>>
>> Kind regards,
>> George
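For reference, a minimal sketch of the driver configuration being discussed, assuming the DataStax Java driver 2.x/3.x and a made-up contact point; LOCAL_ONE is the consistency level that ended up keeping requests on "myLocalDC" replicas:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.QueryOptions;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

    public class LocalDcExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1") // hypothetical contact point
                    .withLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder()
                            .withLocalDc("myLocalDC")
                            .withUsedHostsPerRemoteDc(1) // keeps 1 remote host per DC as a fallback
                            .build())
                    // LOCAL_ONE restricts reads/writes to local-DC replicas, unlike plain ONE
                    .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
                    .build();
            Session session = cluster.connect();
            session.execute("SELECT release_version FROM system.local");
            cluster.close();
        }
    }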