sstable-to-arrow

2021-07-28 Thread Sebastian Estevez
Hi folks,

There was some discussion on here a couple of weeks ago about using the
Apache Arrow in-memory format for Cassandra data, so I thought I'd share the
following posts / code we just released as alpha (Apache 2 license).


Code:
https://github.com/datastax/sstable-to-arrow

Post Part 1:
https://www.datastax.com/blog/analyzing-cassandra-data-using-gpus-part-1
Post Part 2:
https://www.datastax.com/blog/analyzing-cassandra-data-using-gpus-part-2

I also think the cross-language sstable parsing code and visual
documentation are a tremendous contribution to the project and would love
to see more folks pick it up and use it for other purposes.

If anyone is interested feel free to reach out or join our live workshop on
this topic in mid August:

https://www.eventbrite.com/e/workshop-analyzing-cassandra-data-with-gpus-tickets-164294668777

--Seb


Re: Nodetool clearsnapshot does not delete snapshot for dropped column_family

2020-04-30 Thread Sebastian Estevez
Perhaps you had a DDL collision and ended up with two data dirs for the
table?

In that case running drop table would only move the active table directory
to snapshots and, as Erick suggested, would leave the data in the duplicate
directory "orphaned".

I haven't tried to reproduce this yet, but given how DDL works in C*, it
checks out as a possible scenario.
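
If you want to check for that, the comparison is straightforward (paths and
names here are illustrative -- ks/a stand in for your keyspace/table, and the
id column is where the schema keeps the active table's directory suffix on
3.x+):

$ cqlsh -e "SELECT id FROM system_schema.tables WHERE keyspace_name='ks' AND table_name='a';"
$ ls -d /var/lib/cassandra/data/ks/a-*

Any a-<uuid> directory whose suffix (the id with the dashes stripped) doesn't
match the id above would be one of these orphans.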

Keep calm and Cassandra on folks,

Seb


On Thu, Apr 30, 2020, 7:46 PM Sergio  wrote:

> The problem is that folder is not under snapshot but it is under the data
> path.
> I tried with the --all switch too
> Thanks,
> Sergio
>
> On Thu, Apr 30, 2020, 4:21 PM Nitan Kainth  wrote:
>
>> I don't think it works like that. clearsnapshot --all would remove all
>> snapshots. Here is an example:
>>
>> $ ls -l
>> /ss/xx/cassandra/data/ww/a-5bf825428b3811eabe0c6b7631a60bb0/snapshots/
>>
>> total 8
>>
>> drwxr-xr-x 2 cassandra cassandra 4096 Apr 30 23:17
>> dropped-1588288650821-a
>>
>> drwxr-xr-x 2 cassandra cassandra 4096 Apr 30 23:17 manual
>>
>> $ nodetool clearsnapshot --all
>>
>> Requested clearing snapshot(s) for [all keyspaces] with [all snapshots]
>>
>> $ ls -l
>> /ss/xx/cassandra/data/ww/a-5bf825428b3811eabe0c6b7631a60bb0/snapshots/
>>
>> ls: cannot access
>> /ss/xx/cassandra/data/ww/a-5bf825428b3811eabe0c6b7631a60bb0/snapshots/:
>> No such file or directory
>>
>> $
>>
>>
>> On Thu, Apr 30, 2020 at 5:44 PM Erick Ramirez 
>> wrote:
>>
>>> Yes, you're right. It doesn't show up in listsnapshots nor does
>>> clearsnapshot remove the dropped snapshot because the table is no
>>> longer managed by C* (because it got dropped). So you will need to manually
>>> remove the dropped-* directories from the filesystem.
>>>
>>> Someone here will either correct me or hopefully provide a
>>> user-friendlier solution. Cheers!
>>>
>>


Re: Table not updating

2020-03-23 Thread Sebastian Estevez
I have seen cases where folks thought they were writing successfully to the
database but were really hitting timeouts due to an unhandled future in
their loading program. This may very well not be your issue but it's common
enough that I thought I would mention it.
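
For what it's worth, the diagnostic steps Jeff lays out below come down to
something like this at the console (a sketch; ks.t, the key, and the data path
are placeholders for your own, and the mc-* file naming assumes 3.11):

$ nodetool getendpoints ks t 'the-key'
$ nodetool getsstables ks t 'the-key'    # on one of the endpoint nodes
$ sstabledump /var/lib/cassandra/data/ks/t-<id>/mc-12-big-Data.db | less

cqlsh> CONSISTENCY ALL;
cqlsh> SELECT v, WRITETIME(v) FROM ks.t WHERE k = 'the-key';

A writetime in the future (microseconds since epoch, not millis) would be the
smoking gun for the clock-skew case.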

Hope you get to the bottom of it!

All the best,



Sebastián Estévez


On Mon, Mar 23, 2020 at 8:50 PM Jeff Jirsa  wrote:

> You need to see what's in that place, it could be:
>
> 1) Delete in the future (viewable with SELECT WRITETIME(column) ...). This
> could be clock skew or using the wrong resolution timestamps (millis vs
> micros)
> 2) Some form of corruption if you dont have compression + crc check
> chance. It's possible (but unlikely) that you can have a really broken data
> file that simulates a deletion marker. You may be able to find this with
> sstable2json (older versions) or sstabledump (3.0+)
>
> sstabledump your data files that have the key (nodetool getendpoints,
> nodetool getsstables, sstabledump), look for something unusual.
>
>
>
> On Mon, Mar 23, 2020 at 4:00 PM Oliver Herrmann 
> wrote:
>
>> Hello,
>>
>> we are facing a strange issue in one of our Cassandra clusters.
>> We are using prepared statements to update a table with consistency local
>> quorum. When updating some tables it happes very often that data values are
>> not written to the database. When verifying the table using cqlsh (with
>> consistency all) the row does not exist.
>> When using the prepared statements we do not bind values to all
>> placeholder for data columns but I think this should not be a problem,
>> right?
>>
>> I checked system.log and debug.log for any hints but nothing is written
>> into these log files.
>> It's only happening in one specific cluster. When running the same
>> software in other clusters everything is working fine.
>>
>> We are using Cassanda server version 3.11.1 and datastax cpp driver
>> 2.13.0.
>>
>> Any idea how to analyze/fix this problem?
>>
>> Regards
>> Oliver
>>
>>


Re: GraalVM

2019-05-09 Thread Sebastian Estevez
Hope you find those helpful :)

If you missed us, these things are recorded
https://www.twitch.tv/datastaxacademy/videos


All the best,



Sebastián Estévez | Vanguard Solution Architect

Mobile +1.954.905.8615

sebastian.este...@datastax.com  |  datastax.com



On Thu, May 9, 2019 at 1:19 PM Chris Hane  wrote:

> Awesome.  Will try to join.
>
> Thanks for the links.  Will look through them also.
>
> On Thu, May 9, 2019 at 8:33 AM Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Hi Chris,
>>
>> Funny you mention this today of all days because we're doing a twitch
>> streaming session in a few hours on this very topic with Adron from our
>> dev-rel team.
>>
>> The short answer is yes it works. Here's the example project we're
>> working on that uses GraalVM via quarkus.io:
>> https://github.com/phact/sebulba (it's
>> the app we'll be using for the Drone Race at Accelerate).
>>
>> Here's the bit of code where I wrap the datastax java driver
>> https://github.com/phact/sebulba/blob/master/src/main/java/com/datastax/powertools/managed/DSEManager.java
>>
>> The remaining code is just statements and business logic. Very little
>> boiler plate. So far I really like it from a dev experience perspective,
>> especially the hot reloading you get for backend code with quarkus.
>>
>> Feel free to join us at 3pm EST to see us code and ask questions
>> https://www.twitch.tv/datastaxacademy
>>
>> All the best,
>>
>>
>>
>> Sebastián Estévez | Vanguard Solution Architect
>>
>> Mobile +1.954.905.8615
>>
>> sebastian.este...@datastax.com  |  datastax.com
>>
>>
>>
>

Re: GraalVM

2019-05-09 Thread Sebastian Estevez
Hi Chris,

Funny you mention this today of all days because we're doing a twitch
streaming session in a few hours on this very topic with Adron from our
dev-rel team.

The short answer is yes it works. Here's the example project we're working
on that uses GraalVM via quarkus.io: https://github.com/phact/sebulba (it's
the app we'll be using for the Drone Race at Accelerate).

Here's the bit of code where I wrap the datastax java driver
https://github.com/phact/sebulba/blob/master/src/main/java/com/datastax/powertools/managed/DSEManager.java

The remaining code is just statements and business logic. Very little
boiler plate. So far I really like it from a dev experience perspective,
especially the hot reloading you get for backend code with quarkus.
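
(If anyone wants to try that hot reload, it's just standard quarkus dev mode,
nothing special in the repo -- assuming you build with the maven wrapper:

$ ./mvnw compile quarkus:dev

Save a change to a class and the next request picks it up without a restart.)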

Feel free to join us at 3pm EST to see us code and ask questions
https://www.twitch.tv/datastaxacademy

All the best,



Sebastián Estévez | Vanguard Solution Architect

Mobile +1.954.905.8615

sebastian.este...@datastax.com  |  datastax.com



On Thu, May 9, 2019 at 12:51 AM Chris Hane  wrote:

>
> Has anyone worked with graalvm to include a cql driver in the native-image
> build?
>
> Looking to see if it is possible or known to not be possible?
>
> Thanks,
> Chris
>


Re: How to measure the schema size

2018-10-26 Thread Sebastian Estevez
Here's the metrics you want:
http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics

The best practice is to run fewer bigger tables. If it's a lot of tables
you're likely out of luck aside from throwing more RAM at the problem.
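
Concretely, the per-table heap/offheap consumers show up in nodetool, and you
can get a quick table count from the schema tables (locations assume 3.0+; the
keyspace/table names are illustrative):

$ cqlsh -e "SELECT keyspace_name, table_name FROM system_schema.tables;" | wc -l
$ nodetool tablestats my_ks.my_table | grep -iE 'memtable|bloom filter'

The memtable and bloom filter lines are the big per-table memory numbers to
watch as the table count grows.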

All the best,



Sebastián Estévez | Vanguard Solution Architect

Mobile +1.954.905.8615

sebastian.este...@datastax.com  |  datastax.com



On Fri, Oct 26, 2018 at 6:25 PM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> anyone has any idea on this?
>
> On Thu, Oct 25, 2018 at 11:35 AM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> Hello,
>>
>> I am running into a situation where huge schema (# of CF) causing OOM
>> issues to the heap. is there a way to measure how much size each column
>> family uses in the heap?
>>
>


Re: token distribution in multi-dc

2017-05-02 Thread Sebastian Estevez
Hi Justin,

> each DC will have a complete token ring

This statement--and the diagram--imply that you can have multiple nodes
with the same token as long as they are in separate DCs. This isn't the
case, though I understand how it is easy to fall into that trap.

To help reason about this consider the following:

If you could have two nodes (even across DC's) with the same token (T), how
would you determine the primary replica for a global consistency level
query that hashes to that token (T) +1? How would you calculate the ranges
for a global range repair?

You can either view ranges within a datacenter (for local queries and local
repairs) or you can view ranges cluster wide (for global queries and global
repairs). It is also useful sometimes to think of two DC's as two rings, we
draw them like that all the time. However, there is really only one ring
(to rule them all) which means token collisions are not allowed.

> setup with V-nodes enabled

Fortunately Vasu's configuration uses vnodes and so it doesn't really
matter. The vnode allocation algorithm will ensure that there are no
collisions. Unless you are hardcoding your vnode tokens in the yaml
manually, in which case I'd be curious to understand why.
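
You can also convince yourself at the console: nodetool ring prints every
token in the cluster exactly once, along with the node that owns it.

$ nodetool ring

Scan the token column across both DCs' sections and you won't find a repeat.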


All the best,


Sebastián




On Tue, May 2, 2017 at 8:45 PM, Justin Cameron 
wrote:

> That's correct - each DC will have a complete token ring. You can think of
> a Cassandra data center as effectively it's own self-contained
> "mini-cluster". See diagram below (diagram assumes RF3 in both DCs and a
> token range of only 0-99 for readability).
>
> [image: 2dc.png]
>
> On Wed, 3 May 2017 at 08:07 vasu gunja  wrote:
>
>> I'm confused now. please someone confirm with proof.
>>
>> On Tue, May 2, 2017 at 4:54 PM, vasu gunja  wrote:
>>
>>> In that case there will be duplication of token ranges present in the
>>> cluster, right?
>>> Please prove it.
>>>
>>>
>>> On Mon, May 1, 2017 at 7:54 PM, Justin Cameron 
>>> wrote:
>>>
 Hi Vasu,

 Each DC has a complete token range.

 Cheers,
 Justin

 On Tue, 2 May 2017 at 06:32 vasu gunja  wrote:

> Hi ,
>
> I have a question regarding token distribution in muti-dc setup.
>
> We are having multi-dc (DC1+DC2) setup with V-nodes enabled.
> How token ranges will be distributed in cluster ?
>
> Does the complete cluster have one complete token range?
> Or does each DC have a complete token range?
>
>
> --


 *Justin Cameron* | Senior Software Engineer



>>>
>>>
>> --
>
>
> *Justin Cameron* | Senior Software Engineer
>
>
>


Re: what does this Note mean?

2016-10-13 Thread Sebastian Estevez
It means that for accurate load and ownership statistics you should add the
keyspace name after nodetool status:

nodetool status <keyspace_name>
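
For example, with an illustrative keyspace:

$ nodetool status my_keyspace

With a keyspace argument the Owns (effective) column is computed against that
keyspace's replication settings; without one, effective ownership can't be
computed when keyspaces have different replication, which is all the note is
telling you.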




All the best,




On Thu, Oct 13, 2016 at 10:08 PM, Kant Kodali  wrote:

>
> Note: Non-system keyspaces don't have the same replication settings,
> effective ownership information is meaningless
>


Re: Are updates on map columns causing tombstones?

2016-07-12 Thread Sebastian Estevez
I did a short write-up on this.

http://www.sestevez.com/on-cassandra-collections-updates-and-tombstones/
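
If you want to watch the range tombstone appear, flush after a full-map
overwrite and dump the sstable (a sketch; the table and path are illustrative
and the file naming is 3.x's):

$ nodetool flush ks users
$ sstabledump /var/lib/cassandra/data/ks/users-<id>/ma-1-big-Data.db

Look for the range tombstone bounds covering the collection in the JSON
output.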

all the best,

Sebastián
On Jul 11, 2016 4:22 PM, "Alain RODRIGUEZ"  wrote:

> Hi Jan,
>
>
>> when I replace the content of a map-valued column (when I replace the
>> complete map), will this create tombstones for those map entries that are
>> not present in the new map?
>
>
> This is almost correct, I would be more precise and say that it will
> create a range tombstone on this map, and create a new one. No merge. To
> have the behaviour you described, you have to set the map though:
>
> // Updating (or inserting)
> UPDATE users SET favs['author'] = 'Ed Poe' WHERE id = 'jsmith'
> UPDATE users SET favs = favs - { 'movie' } WHERE id = 'jsmith'
>
> ==> Tombstone on some specific cells or adding a value in a collection. 
> Previous data is kept.
>
> *Instead of*
>
> // Updating (or inserting)
>
> INSERT INTO users (id, given, surname, favs) VALUES ('jsmith', 'John', 
> 'Smith', { 'fruit' : 'apple', 'band' : 'Beatles' })
> UPDATE users SET favs = { 'movie' : 'Cassablanca' } WHERE id = 'jsmith'
>
> ==> Range tombstone on the previous collection (the whole thing). Tombstone +
> new write (no merge). There is no explicit delete, yet data gets dropped,
> which can be unexpected and hard to troubleshoot.
>
> C*heers,
>
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consultinghttp://www.thelastpickle.com
>
> 2016-07-11 16:51 GMT+02:00 Matthias Niehoff <
> matthias.nieh...@codecentric.de>:
>
>> Hi,
>>
>> it depends.
>> - If you defined the type as a frozen map (frozen<map<...>>) there will be
>> no tombstone, as the map is stored as one binary blob. The update is handled
>> as a normal upsert.
>> - If you do not use the frozen keyword you are right. There will be range
>> tombstones for all columns that have been deleted or updated.
>>
>> 2016-07-11 16:16 GMT+02:00 Jan Algermissen :
>>
>>> Hi,
>>>
>>> when I replace the content of a map-valued column (when I replace the
>>> complete map), will this create tombstones for those map entries that are
>>> not present in the new map?
>>>
>>> My expectation is 'yes', because the map is laid out as normal columns
>>> internally so keys not in the new map should lead to a delete.
>>>
>>> Is that correct?
>>>
>>> Jan
>>>
>>
>>
>>
>> --
>> Matthias Niehoff | IT-Consultant | Agile Software Factory  | Consulting
>> codecentric AG | Zeppelinstr 2 | 76185 Karlsruhe | Deutschland
>> tel: +49 (0) 721.9595-681 | fax: +49 (0) 721.9595-666 | mobil: +49 (0)
>> 172.1702676
>> www.codecentric.de | blog.codecentric.de | www.meettheexperts.de |
>> www.more4fi.de
>>
>>
>
>


Re: Problems with nodetool

2016-06-29 Thread Sebastian Estevez
Did you mean `nodetool status` not `node-tool status` ?

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Jun 29, 2016 at 6:42 AM, Ralf Meier  wrote:

> Hi everybody,
>
> I tried to install a cassandra cluster using docker (official image) on 6
> different machines. (Each physical machine will host one docker container.)
> Each physical node has two network cards, one of them for an "internal
> network" which the cassandra cluster should use for communication. (IP:
> 10.20.39.1 to x.x.x.6)
> Because of a port conflict on the host machine I had to change the port
> 7000 in cassandra.yaml to 7002 for the communication between the nodes.
>
> The docker containers spun up without any issues on each node.
>
> Now I tried to check if all nodes could communicate with each other by using
> the "node-tool status" command. But whenever I entered the command, I only
> got the help text for how to use the node-tool as output. (Even if I add -p
> 7002 it does not help.)
> I did not get any status about the cluster.
>
> So far I did not find anything in the logs, but I could also not
> check the status of the cluster.
>
> Does somebody have an idea how to change the configuration, or what I
> have to change so that the cluster is working?
>
> Thanks for your help
> BR
> Ralf
>
>
>
> Attached find the configuration which where set in cassandra.yaml (from
> node 1 which should also act as seed node)
> cluster_name: 'TestCluster'
> num_tokens: 256
> max_hint_window_in_ms: 10800000 # 3 hours
> hinted_handoff_throttle_in_kb: 1024
> max_hints_delivery_threads: 2
> hints_flush_period_in_ms: 10000
> max_hints_file_size_in_mb: 128
> batchlog_replay_throttle_in_kb: 1024
> authenticator: AllowAllAuthenticator
> authorizer: AllowAllAuthorizer
> role_manager: CassandraRoleManager
> roles_validity_in_ms: 2000
> permissions_validity_in_ms: 2000
> credentials_validity_in_ms: 2000
> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> data_file_directories:
> - /var/lib/cassandra/data
> commitlog_directory: /var/lib/cassandra/commitlog
> disk_failure_policy: stop
> commit_failure_policy: stop
> prepared_statements_cache_size_mb:
> thrift_prepared_statements_cache_size_mb:
> key_cache_size_in_mb:
> key_cache_save_period: 14400
> row_cache_size_in_mb: 0
> row_cache_save_period: 0
> counter_cache_size_in_mb:
> counter_cache_save_period: 7200
> saved_caches_directory: /var/lib/cassandra/saved_caches
> commitlog_sync: periodic
> commitlog_sync_period_in_ms: 10000
> commitlog_segment_size_in_mb: 32
> seed_provider:
> - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>   parameters:
>   - seeds: "10.20.39.1"
> concurrent_reads: 32
> concurrent_writes: 32
> concurrent_counter_writes: 32
> concurrent_materialized_view_writes: 32
> memtable_allocation_type: heap_buffers
> index_summary_capacity_in_mb:
> index_summary_resize_interval_in_minutes: 60
> trickle_fsync: false
> trickle_fsync_interval_in_kb: 10240
> storage_port: 7002
> ssl_storage_port: 7001
> listen_address: 10.20.39.1
> broadcast_address: 10.20.39.1
> start_rpc: false
> rpc_address: 0.0.0.0
> rpc_port: 9160
> broadcast_rpc_address: 10.20.39.1
> rpc_keepalive: true
> rpc_server_type: sync
> thrift_framed_transport_size_in_mb: 15
> incremental_backups: false
> snapshot_before_compaction: false
> auto_snapshot: true
> column_index_size_in_kb: 64
> column_index_cache_size_in_kb: 2
> compaction_throughput_mb_per_sec: 16
> sstable_preemptive_open_interval_in_mb: 50
> read_request_timeout_in_ms: 5000
> range_request_timeout_in_ms: 10000
> write_request_timeout_in_ms: 2000
> counter_write_request_timeout_in_ms: 5000
> cas_contention_timeout_in_ms: 1000
> truncate_request_timeout_in_ms: 60000
> request_timeout_in_ms: 10000
> cross_node_timeout: false
> endpoint_snitch: SimpleSnitch
> dynamic_snitch_update_interval_in_ms: 100
> dynamic_snitch_reset_interval_in_ms: 600000
> dynamic_snitch_badness_threshold: 0.1
> request_scheduler: org.apache.cassandra.scheduler.NoScheduler
> server_encryption_options:
> internode_encryption: none
> keystore: conf/.keystore
> 

Re: Slow nodetool response time

2016-06-22 Thread Sebastian Estevez
Sounds like your process is spending a lot of time in a blocked state (real -
user - sys). Check your OS subsystems; maybe your machine is bogged down by
other work.

FWIW, my time in an idle system is about 2 seconds but can go up to ~13
seconds on a busy system with 70% cpu utilized. No difference between 1 and
3 node setups.
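
A quick way to see where the time goes on the slow box (nothing C*-specific):

$ time nodetool version
$ vmstat 1 5     # watch the b (blocked) and wa (iowait) columns

If real is much larger than user + sys in the time output, the JVM that
nodetool spawns is waiting on CPU or IO rather than burning it.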

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Jun 22, 2016 at 9:03 AM, Bhuvan Rawal  wrote:

> Hi,
>
> We have been facing slowness in getting response from nodetool for any of
> its subcommand. On the same version on AWS it responds really fast but on
> local 1 node machine or local DC cluster it performs very slow.
>
> On Local DC :
> *$ time nodetool version*
> ReleaseVersion: 3.0.3
>
> real 0m*17.582s*
> user 0m2.334s
> sys 0m0.470s
>
> On AWS:
> *$ time nodetool version*
> ReleaseVersion: 3.0.3
>
> real 0m*1.084s*
> user 0m1.772s
> sys 0m0.363s
>
> Any way by which its speed can be increased?
>
> Thanks & Regards,
> Bhuvan
>


Re: Error while rebuilding a node: Stream failed

2016-05-27 Thread Sebastian Estevez
Check ifconfig for dropped TCP messages. Let's rule out your network.
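
Something like this on both ends of the stream (the interface name is a
guess, use yours):

$ ifconfig eth0 | grep -iE 'dropped|errors'
$ netstat -s | grep -iE 'retrans|timeout'

Counters that keep climbing while the rebuild runs point at the network
rather than C*.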

all the best,

Sebastián
On May 27, 2016 10:45 AM, "George Sigletos"  wrote:

> Hello,
>
> No there is no version mix. The first stack traces were indeed from
> 2.1.13. Then I upgraded all nodes to 2.1.14. Still getting the same errors
>
>
> On Fri, May 27, 2016 at 4:39 PM, Eric Evans 
> wrote:
>
>> From the various stacktraces in this thread, it's obvious you are
>> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
>> supported with mixed Cassandra versions.  Sometimes it will work,
>> sometimes it won't (and it will definitely not work in this instance).
>>
>> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
>> the new nodes using 2.1.13, and upgrade after.
>>
>> On Fri, May 27, 2016 at 8:41 AM, George Sigletos 
>> wrote:
>>
>>  ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>>  StreamSession.java:505 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>  Streaming error occurred
>>  java.lang.RuntimeException: Outgoing stream handler has been closed
>>  at
>> 
>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> 
>>  And this is from the source node:
>> 
>>  ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>>  StreamSession.java:505 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>  Streaming error occurred
>>  java.io.IOException: Broken pipe
>>  at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>  ~[na:1.7.0_79]
>>  at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown
>> Source)
>>  ~[na:1.7.0_79]
>>  at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>>  ~[na:1.7.0_79]
>>  at
>> 
>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>  ~[apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>>  [apache-cassandra-2.1.14.jar:2.1.14]
>>  at
>> 
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>>  [apache-cassandra-2.1.14.jar:2.1.14]
>>
>>
>> >>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>> >>> StreamSession.java:620 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>> Remote peer 192.168.1.140 failed stream session.
>> >>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>> >>> StreamSession.java:505 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>> Streaming error occurred
>> >>> java.io.IOException: Connection timed out
>> >>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>> >>> ~[na:1.7.0_79]
>> >>> at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>> >>> ~[na:1.7.0_79]
>> >>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>> >>> Source) ~[na:1.7.0_79]
>> >>> at sun.nio.ch.IOUtil.write(Unknown Source)
>> ~[na:1.7.0_79]
>> >>> at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>> >>> ~[na:1.7.0_79]
>> >>> at
>> >>>
>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>> >>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>> at
>> >>>
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>> >>> ~

Re: A question to 'paging' support in DataStax java driver

2016-05-10 Thread Sebastian Estevez
I didn't read the whole thread last time around; please disregard my
comment about the java driver jira.

One other thought (hopefully relevant this time). Once we have
https://issues.apache.org/jira/browse/CASSANDRA-10783, you could write a
(*start*, *rows*) style paging UDF which would allow you to read just page 4,
for example. Granted, you will still have to *scan* the data from 0 to *start*
at the server and throw it away, but it might get you closer to what you are
looking for.




All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, May 10, 2016 at 9:23 AM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> I think this request belongs in the java driver jira not the Cassandra
> jira.
>
> https://datastax-oss.atlassian.net/projects/JAVA/
>
> all the best,
>
> Sebastián
> On May 10, 2016 1:09 AM, "Lu, Boying"  wrote:
>
>> I filed a JIRA https://issues.apache.org/jira/browse/CASSANDRA-11741 to
>> track this.
>>
>>
>>
>> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
>> *Sent:* 2016年5月10日 12:47
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: A question to 'paging' support in DataStax java driver
>>
>>
>>
>> I guess it's technically possible but then we'll need to update the
>> binary protocol. Just create a JIRA and ask for this feature
>>
>>
>>
>> On Tue, May 10, 2016 at 5:00 AM, Lu, Boying  wrote:
>>
>> Thanks very much.
>>
>>
>>
>> I understand that the data needs to be read from the DB to get the next
>> ‘PagingState’.
>>
>>
>>
>> But is it possible not to return those data to the client side, just
>> returning the ‘PagingState’?
>>
>> I.e. the data is read on the server side, but not return to client side,
>> this can save some bandwidth
>>
>> between client and server.
>>
>>
>>
>>
>>
>> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
>> *Sent:* 2016年5月9日 21:06
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: A question to 'paging' support in DataStax java driver
>>
>>
>>
>> In a truly consistent world (should I say "snapshot isolation" world
>> instead), re-reading the same page should yield the same results no matter
>> how many new inserts have occurred since the last page read.
>>
>>
>>
>> Caching previous page at app level can be a solution but not viable if
>> the amount of data is huge, also you'll need a cache layer and deal with
>> cache invalidation etc ...
>>
>>
>>
>> The point is, providing snapshot isolation in a distributed system is
>> hard without some sort of synchronous coordination e.g. global lock (read
>> http://www.bailis.org/papers/hat-vldb2014.pdf)
>>
>>
>>
>>
>>
>> On Mon, May 9, 2016 at 2:17 PM, Bhuvan Rawal  wrote:
>>
>> Hi Doan,
>>
>>
>>
>> What does it have to do being eventual consistency? Lets assume a
>> scenario with complete consistency and we are at page X, and at the same
>> time some inserts/updates happened at page X-2 and we jumped to that.
>>
>> User will see inconsistent page in that case as well, right? Also in such
>> cases how would you design a user facing application (Cache previous pages
>> at app level?)
>>
>>
>>
>> Regards,
>>
>> Bhuvan
>>
>>
>>
>> On Mon, May 9, 2016 at 4:18 PM, DuyHai Doan  wrote:
>>
>> "Is it possible to just return PagingState object without returning
>> data?" --> No
>>
>>
>>
>> Simply because before reading the actual data for each page of N rows,
>> you cannot know at wh

RE: A question to 'paging' support in DataStax java driver

2016-05-10 Thread Sebastian Estevez
I think this request belongs in the java driver jira not the Cassandra jira.

https://datastax-oss.atlassian.net/projects/JAVA/

all the best,

Sebastián
On May 10, 2016 1:09 AM, "Lu, Boying"  wrote:

> I filed a JIRA https://issues.apache.org/jira/browse/CASSANDRA-11741 to
> track this.
>
>
>
> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
> *Sent:* 2016年5月10日 12:47
> *To:* user@cassandra.apache.org
> *Subject:* Re: A question to 'paging' support in DataStax java driver
>
>
>
> I guess it's technically possible but then we'll need to update the binary
> protocol. Just create a JIRA and ask for this feature
>
>
>
> On Tue, May 10, 2016 at 5:00 AM, Lu, Boying  wrote:
>
> Thanks very much.
>
>
>
> I understand that the data needs to be read from the DB to get the next
> ‘PagingState’.
>
>
>
> But is it possible not to return those data to the client side, just
> returning the ‘PagingState’?
>
> I.e. the data is read on the server side, but not return to client side,
> this can save some bandwidth
>
> between client and server.
>
>
>
>
>
> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
> *Sent:* 2016年5月9日 21:06
> *To:* user@cassandra.apache.org
> *Subject:* Re: A question to 'paging' support in DataStax java driver
>
>
>
> In a truly consistent world (should I say "snapshot isolation" world
> instead), re-reading the same page should yield the same results no matter
> how many new inserts have occurred since the last page read.
>
>
>
> Caching previous page at app level can be a solution but not viable if the
> amount of data is huge, also you'll need a cache layer and deal with cache
> invalidation etc ...
>
>
>
> The point is, providing snapshot isolation in a distributed system is hard
> without some sort of synchronous coordination e.g. global lock (read
> http://www.bailis.org/papers/hat-vldb2014.pdf)
>
>
>
>
>
> On Mon, May 9, 2016 at 2:17 PM, Bhuvan Rawal  wrote:
>
> Hi Doan,
>
>
>
> What does it have to do being eventual consistency? Lets assume a scenario
> with complete consistency and we are at page X, and at the same time some
> inserts/updates happened at page X-2 and we jumped to that.
>
> User will see inconsistent page in that case as well, right? Also in such
> cases how would you design a user facing application (Cache previous pages
> at app level?)
>
>
>
> Regards,
>
> Bhuvan
>
>
>
> On Mon, May 9, 2016 at 4:18 PM, DuyHai Doan  wrote:
>
> "Is it possible to just return PagingState object without returning
> data?" --> No
>
>
>
> Simply because before reading the actual data for each page of N rows, you
> cannot know at which token value a page of data starts...
>
>
>
> And it is worst than that, with paging you don't have any isolation. Let's
> suppose you keep in your application/web front-end the paging states for
> page 1, 2 and 3. Since there are concurrent inserts on the cluster at the
> same time, when you re-use the paging state 2 for example, you may not get
> the same results as the previous read.
>
>
>
> And it is inevitable in an eventual consistent distributed DB world
>
>
>
> On Mon, May 9, 2016 at 12:25 PM, Lu, Boying  wrote:
>
> Hi, All,
>
>
>
> We are considering to use DataStax java driver in our codes. One important
> feature provided by the driver we want to use is ‘paging’.
>
> But according to the
> https://datastax.github.io/java-driver/3.0.0/manual/paging/, it seems
> that we can’t jump between pages.
>
>
>
> Is it possible to just return PagingState object without returning data?
> e.g.  If I want to jump to the page 5 from the page 1,
>
> I need to go through each page from page 1 to page 5,  Is it possible to
> just return the PagingState object of page 1, 2, 3 and 4 without
>
> actual data of each page? This can save some bandwidth at least.
>
>
>
> Thanks in advance.
>
>
>
> Boying
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: Cassandra causing OOM Killer to strike on new cluster running 3.4

2016-03-11 Thread Sebastian Estevez
"Sacrifice child" in dmesg is your OS killing the process with the most RAM.
That means you're actually running out of memory at the Linux level, outside
of the JVM.

Are you running anything other than Cassandra on this box?

If so, does it have a memory leak?
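
Quick triage for the next time it dies, plain Linux, nothing C*-specific:

$ dmesg -T | grep -iE 'out of memory|oom|sacrifice'
$ ps aux --sort=-rss | head

Compare the java RSS against the 8G heap; a large and growing gap points at
native/off-heap allocations rather than the heap itself.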

all the best,

Sebastián
On Mar 11, 2016 11:14 AM, "Adam Plumb"  wrote:

> I've got a new cluster of 18 nodes running Cassandra 3.4 that I just
> launched and loaded data into yesterday (roughly 2TB of total storage) and
> am seeing runaway memory usage.  These nodes are EC2 c3.4xlarges with 30GB
> RAM and the heap size is set to 8G with a new heap size of 1.6G.
>
> Last night I finished loading up the data, then ran an incremental repair
> on one of the nodes just to ensure that everything was working (nodetool
> repair).  Over night all 18 nodes ran out of memory and were killed by the
> OOM killer.  I restarted them this morning and they all came up fine, but
> just started churning through memory and got killed again.  I restarted
> them again and they're doing the same thing.  I'm not getting any errors in
> the system log, since the process is getting killed abruptly (which makes
> me think this is a native memory issue, not heap)
>
> Obviously this behavior isn't the best.  I'm willing to provide any data
> people need to help debug this, these nodes are still up and running.  I'm
> also in IRC if anyone wants to jump on there.
>
> Here is the output of ps aux:
>
> 497   64351  108 89.5 187156072 27642988 ?  SLl  15:13  62:15 java -ea
>> -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
>> -XX:ThreadPriorityPolicy=42 -Xms7536M -Xmx7536M -Xmn1600M
>> -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103
>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:+UseTLAB -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45
>> -XX:-ParallelRefProcEnabled -XX:-AlwaysPreTouch -XX:+UseBiasedLocking
>> -XX:+UseTLAB -XX:+ResizeTLAB -Djava.net.preferIPv4Stack=true
>> -Dcom.sun.management.jmxremote.port=7199
>> -Dcom.sun.management.jmxremote.rmi.port=7199
>> -Dcom.sun.management.jmxremote.ssl=false
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -XX:+CMSClassUnloadingEnabled -Dlogback.configurationFile=logback.xml
>> -Dcassandra.logdir=/usr/local/cassandra/logs
>> -Dcassandra.storagedir=/usr/local/cassandra/data
>> -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid
>> -cp /usr/local/cassandra/conf:/usr/local/cassandra/build/classes/main:
>> /usr/local/cassandra/build/classes/thrift:
>> /usr/local/cassandra/lib/airline-0.6.jar:
>> /usr/local/cassandra/lib/antlr-runtime-3.5.2.jar:
>> /usr/local/cassandra/lib/apache-cassandra-3.4.jar:
>> /usr/local/cassandra/lib/apache-cassandra-clientutil-3.4.jar:
>> /usr/local/cassandra/lib/apache-cassandra-thrift-3.4.jar:
>> /usr/local/cassandra/lib/asm-5.0.4.jar:
>> /usr/local/cassandra/lib/cassandra-driver-core-3.0.0-shaded.jar:
>> /usr/local/cassandra/lib/commons-cli-1.1.jar:
>> /usr/local/cassandra/lib/commons-codec-1.2.jar:
>> /usr/local/cassandra/lib/commons-lang3-3.1.jar:
>> /usr/local/cassandra/lib/commons-math3-3.2.jar:
>> /usr/local/cassandra/lib/compress-lzf-0.8.4.jar:
>> /usr/local/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:
>> /usr/local/cassandra/lib/concurrent-trees-2.4.0.jar:
>> /usr/local/cassandra/lib/disruptor-3.0.1.jar:
>> /usr/local/cassandra/lib/ecj-4.4.2.jar:
>> /usr/local/cassandra/lib/guava-18.0.jar:
>> /usr/local/cassandra/lib/high-scale-lib-1.0.6.jar:
>> /usr/local/cassandra/lib/hppc-0.5.4.jar:
>> /usr/local/cassandra/lib/jackson-core-asl-1.9.2.jar:
>> /usr/local/cassandra/lib/jackson-mapper-asl-1.9.2.jar:
>> /usr/local/cassandra/lib/jamm-0.3.0.jar:
>> /usr/local/cassandra/lib/javax.inject.jar:
>> /usr/local/cassandra/lib/jbcrypt-0.3m.jar:
>> /usr/local/cassandra/lib/jcl-over-slf4j-1.7.7.jar:
>> /usr/local/cassandra/lib/jflex-1.6.0.jar:
>> /usr/local/cassandra/lib/jna-4.0.0.jar:
>> /usr/local/cassandra/lib/joda-time-2.4.jar:
>> /usr/local/cassandra/lib/json-simple-1.1.jar:
>> /usr/local/cassandra/lib/libthrift-0.9.2.jar:
>> /usr/local/cassandra/lib/log4j-over-slf4j-1.7.7.jar:
>> /usr/local/cassandra/lib/logback-classic-1.1.3.jar:
>> /usr/local/cassandra/lib/logback-core-1.1.3.jar:
>> /usr/local/cassandra/lib/lz4-1.3.0.jar:
>> /usr/local/cassandra/lib/metrics-core-3.1.0.jar:
>> /usr/local/cassandra/lib/metrics-logback-3.1.0.jar:
>> /usr/local/cassandra/lib/netty-all-4.0.23.Final.jar:
>> /usr/local/cassandra/lib/ohc-core-0.4.2.jar:
>> /usr/local/cassandra/lib/ohc-core-j8-0.4.2.jar:
>> /usr/local/cassandra/lib/primitive-1.0.jar:
>> /usr/local/cassandra/lib/reporter-config3-3.0.0.jar:
>> /usr/local/cassandra/lib/reporter-config-base-3.0.0.jar:
>> /usr/local/cassandra/lib/sigar-1.6.4.jar:/

Re: How to measure the write amplification of C*?

2016-03-10 Thread Sebastian Estevez
https://issues.apache.org/jira/browse/CASSANDRA-10805
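
If you want to measure it yourself the way Matt and Jeff describe below,
something along these lines (the SMART attribute name/number and the 512-byte
unit vary by vendor, so check your drive's documentation):

$ sudo smartctl -A /dev/sda | grep -i total_lbas_written   # physical bytes ~= LBAs * 512
$ grep 'Compacted' /var/log/cassandra/system.log | tail    # bytes rewritten by compaction

Divide the physical bytes written by the bytes you logically wrote to C* to
get the end-to-end amplification.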

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Mar 10, 2016 at 1:10 PM, Jeff Ferland  wrote:

> Compaction logs show the number of bytes written and the level written to.
> Base write load = table flushed to L0.
> Write amplification = sum of all compactions written to disk for the table.
>
> On Thu, Mar 10, 2016 at 9:44 AM, Dikang Gu  wrote:
>
>> Hi Matt,
>>
>> Thanks for the detailed explanation! Yes, this is exactly what I'm
>> looking for, "write amplification = data written to flash/data written
>> by the host".
>>
>> We are heavily using the LCS in production, so I'd like to figure out the
>> amplification caused by that and see what we can do to optimize it. I have
>> the metrics of "data written to flash", and I'm wondering is there an
>> easy way to get the "data written by the host" on each C* node?
>>
>> Thanks
>>
>> On Thu, Mar 10, 2016 at 8:48 AM, Matt Kennedy 
>> wrote:
>>
>>> TL;DR - Cassandra actually causes a ton of write amplification but it
>>> doesn't freaking matter any more. Read on for details...
>>>
>>> That slide deck does have a lot of very good information on it, but
>>> unfortunately I think it has led to a fundamental misunderstanding about
>>> Cassandra and write amplification. In particular, slide 51 vastly
>>> oversimplifies the situation.
>>>
>>> The wikipedia definition of write amplification looks at this from the
>>> perspective of the SSD controller:
>>> https://en.wikipedia.org/wiki/Write_amplification#Calculating_the_value
>>>
>>> In short, write amplification = data written to flash/data written by
>>> the host
>>>
>>> So, if I write 1MB in my application, but the SSD has to write my 1MB,
>>> plus rearrange another 1MB of data in order to make room for it, then I've
>>> written a total of 2MB and my write amplification is 2x.
>>>
>>> In other words, it is measuring how much extra the SSD controller has to
>>> write in order to do its own housekeeping.
>>>
>>> However, the wikipedia definition is a bit more constrained than how the
>>> term is used in the storage industry. The whole point of looking at write
>>> amplification is to understand the impact that a particular workload is
>>> going to have on the underlying NAND by virtue of the data written. So a
>>> definition of write amplification that is a little more relevant to the
>>> context of Cassandra is to consider this:
>>>
>>> write amplification = data written to flash/data written to the database
>>>
>>> So, while the fact that we only sequentially write large immutable
>>> SSTables does in fact mean that controller-level write amplification is
>>> near zero, Compaction comes along and completely destroys that tidy little
>>> story. Think about it, every time a compaction re-writes data that has
>>> already been written, we are creating a lot of application-level write
>>> amplification. Different compaction strategies and the workload itself
>>> impact what the real application-level write amp is, but generally
>>> speaking, LCS is the worst, followed by STCS and DTCS will cause the least
>>> write-amp. To measure this, you can usually use smartctl (may be another
>>> mechanism depending on SSD manufacturer) to get the physical bytes written
>>> to your SSDs and divide that by the data that you've actually logically
>>> written to Cassandra. I've measured (more than two years ago) LCS write amp
>>> as high as 50x on some workloads, which is significantly higher than the
>>> typical controller level write amp on a b-tree style update-in-place data
>>> store. Also note that the new storage engine in general reduces a lot of
>>> inefficiency in the Cassandra storage engine therefore reducing the impact
>>> of write amp due to compactions.
>>>
>>> However, if you're a person that understands SSDs, at this point you're
>>> wondering why we aren't burning out SSDs right and left. The reality is
>>> that general SSD endurance has gotten so good, that all this write amp
>>> isn't really a problem any more. If you're curious to read more about that,
>>> I recommend you start h

Re: How to complete bootstrap with exception due to stream failure?

2016-02-27 Thread Sebastian Estevez
progress: 361% does not look right (probably a bug).

Can you check the corresponding messages on the other side of the stream?
I.E. the system log for 192.168.10.8 around 18:02:06?
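
For example, on 192.168.10.8 (default package log location; adjust for
yours):

$ grep -B3 -A10 -E 'Streaming error|Stream failed' /var/log/cassandra/system.log

The few lines around that timestamp are usually where the real cause shows
up.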

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Sat, Feb 27, 2016 at 6:12 PM, Jason Kania  wrote:

> Hello,
>
> I am trying to get a node bootstrapped in 3.0.3, but just at the point
> where the bootstrap process is to complete, a broken pipe exception occurs
> so the bootstrap process hangs. Once I kill the bootstrap process, I can
> execute "nodetool bootstrap resume" again and the same problem will occur
> just at the end of the bootstrap exercise. Here is the tail of the log:
>
> [2016-02-27 18:02:05,898] received file
> /home/cassandra/data/sensordb/listedAttributes-7925634011e59f707d76a8de8480/ma-30-big-Data.db
> (progress: 357%)
> [2016-02-27 18:02:06,479] received file
> /home/cassandra/data/sensordb/notification-f7e3eaa0024b11e5bb310d2316086bf7/ma-38-big-Data.db
> (progress: 361%)
> [2016-02-27 18:02:06,884] session with /192.168.10.8 complete (progress:
> 361%)
> [2016-02-27 18:02:06,886] Stream failed
>
> I attempted to run nodetool repair, but get the following which I have
> been told indicates that the replication factor is 1:
>
> root@bull:~# nodetool repair
> [2016-02-27 18:04:55,083] Nothing to repair for keyspace 'sensordb'
>
> Thanks,
>
> Jason
>


Re: Keyspaces not found in cqlsh

2016-02-11 Thread Sebastian Estevez
>
> On restart of one node I could see repeated errors like " Mutation of
> 22076203 bytes is too large for the maxiumum size of 16777216"


Commitlog segment size is the right lever to get C* to accept larger writes,
but this is not a traditional use for Cassandra. Cassandra is built to
handle lots and lots of small writes, not huge ones. I envision you will
have other pains as a result of your large mutations. If you want to write
huge values, consider chunking them up into smaller writes.
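
For anyone who hits this later, the arithmetic behind the fix: the maximum
mutation size defaults to half the commitlog segment, so the stock 32MB
segment caps a single write at 16MB (16777216 bytes, exactly the number in
the error), and a 64MB segment raises the cap to 32MB, which clears the
~22MB mutations above. Quick check (the path depends on your install):

$ grep commitlog_segment_size_in_mb conf/cassandra.yaml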

> Followed by a rolling restart again. And now there is a single version and
> keyspaces are back in cqlsh.


Good, your original problem was due to the schema disagreement issue and
the rolling restart solved it.



All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Feb 11, 2016 at 10:41 AM, kedar  wrote:

> Thanks a ton Sebastian.
>
> On restart of one node I could see repeated errors like " Mutation of
> 22076203 bytes is too large for the maxiumum size of 16777216"
>
> So I increased commitlog_segment_size_in_mb from 32 to 64mb.
>
> Followed by a rolling restart again. And now there is a single version and
> keyspaces are back in cqlsh.
>
> Cluster Information:
> Name: Test Cluster
> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> Schema versions:
> 9f5b5675-c9e7-3ae3-8ad6-6654fa4fb3e7: [IP1, IP2]
>
> On the flip side what could be implications of increasing
> commitlog_segment_size_in_mb
>
> Thanks,
> Kedar Parikh
>
> On Thursday 11 February 2016 08:18 PM, Sebastian Estevez wrote:
>
> If its a tarball then root should be fine but there were some files owned
> by the Cassandra user so you may want to chown those back to root.
>
> I haven't seen your exact issue before but you have two schema versions
> from your describe cluster so a rolling restart should help.
>
> all the best,
>
> Sebastián
> On Feb 11, 2016 9:28 AM, "kedar" < 
> kedar.par...@netcore.co.in> wrote:
>
>> Thanks Sebastian,
>>
>> Cassandra installation in our case is simply an untar.
>>
>> Cassandra is started using supervisord and user as root, would you still
>> recommend I try using Cassandra user.
>>
>>  ./nodetool describecluster
>> Cluster Information:
>> Name: Test Cluster
>> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
>> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>> Schema versions:
>> cd361577-6947-3390-a787-be28fe499787: [Ip1]
>>
>> 9f5b5675-c9e7-3ae3-8ad6-6654fa4fb3e7: [Ip2]
>>
>> Interestingly
>> ./nodetool cfstats shows all the tables
>>
>> Thanks,
>> Kedar Parikh
>>
>> On Thursday 11 February 2016 07:34 PM, Sebastian Estevez wrote:
>>
>> Keep this on the user list; it's not appropriate for the dev list.
>>
>> 1) I noticed that some of your files are owned by root and others by
>> Cassandra. If this is a package install you should always start C* as a
>> service and chown your files and directories so they are owned by the
>> Cassandra user, not root.  Never start Cassandra directly as root.
>>
>> 2) Once you have fixed your file ownerships, restart Cassandra on each
>> node one at a time. You should see your sstables and commitlog get picked
>> up by Cassandra in the system.log on startup. Share the output of
>> 'nodetool describecluster' before and after.
>>
>> all the best,
>>
>> Sebastián
>> On Feb 11, 2016 6:30 AM, "kedar"  wrote:
>>
>>> Thanks,
>>>
>>> kindly refer the following:
>>>
>>> https://gist.github.com/anonymous/3dddbe728a52c07d7c52
>>> https://gi

Re: OpsCenter 5.2

2016-02-11 Thread Sebastian Estevez
Confirmed.

all the best,

Sebastián
On Feb 11, 2016 12:44 PM, "Ted Yu"  wrote:

> Thanks for the pointer.
>
> Just want to confirm that OpsCenter 5.2 is compatible with DSE 4.8.4 which
> I have deployed.
>
> Cheers
>
> On Thu, Feb 11, 2016 at 7:00 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> The monitoring UI is called DataStax OpsCenter and it has its own install
>> process.
>>
>> Check out our documentation on the subject:
>>
>>
>> http://docs.datastax.com/en/opscenter/5.2/opsc/install/opscInstallOpsc_g.html
>>
>> all the best,
>>
>> Sebastián
>> On Feb 9, 2016 8:01 PM, "Ted Yu"  wrote:
>>
>>> Hi,
>>> I am using DSE 4.8.4
>>> Here are the ports Cassandra daemon listens on:
>>>
>>> tcp0  0 xx.yy:9042  0.0.0.0:*
>>> LISTEN  30773/java
>>> tcp0  0 127.0.0.1:56498 0.0.0.0:*
>>> LISTEN  30773/java
>>> tcp0  0 xx.yy:7000  0.0.0.0:*
>>> LISTEN  30773/java
>>> tcp0  0 127.0.0.1:7199  0.0.0.0:*
>>> LISTEN  30773/java
>>> tcp0  0 xx.yy:9160  0.0.0.0:*
>>> LISTEN  30773/java
>>>
>>> Can you tell me how I can get to the DSE monitoring UI ?
>>>
>>> Thanks
>>>
>>
>


Re: OpsCenter 5.2

2016-02-11 Thread Sebastian Estevez
The monitoring UI is called DataStax OpsCenter and it has its own install
process.

Check out our documentation on the subject:

http://docs.datastax.com/en/opscenter/5.2/opsc/install/opscInstallOpsc_g.html

all the best,

Sebastián
On Feb 9, 2016 8:01 PM, "Ted Yu"  wrote:

> Hi,
> I am using DSE 4.8.4
> Here are the ports Cassandra daemon listens on:
>
> tcp0  0 xx.yy:9042  0.0.0.0:*
> LISTEN  30773/java
> tcp0  0 127.0.0.1:56498 0.0.0.0:*
>   LISTEN  30773/java
> tcp0  0 xx.yy:7000  0.0.0.0:*
> LISTEN  30773/java
> tcp0  0 127.0.0.1:7199  0.0.0.0:*
>   LISTEN  30773/java
> tcp0  0 xx.yy:9160  0.0.0.0:*
> LISTEN  30773/java
>
> Can you tell me how I can get to the DSE monitoring UI ?
>
> Thanks
>


Re: Keyspaces not found in cqlsh

2016-02-11 Thread Sebastian Estevez
If its a tarball then root should be fine but there were some files owned
by the Cassandra user so you may want to chown those back to root.

I haven't seen your exact issue before but you have two schema versions
from your describe cluster so a rolling restart should help.

all the best,

Sebastián
On Feb 11, 2016 9:28 AM, "kedar"  wrote:

> Thanks Sebastian,
>
> Cassandra installation in our case is simply an untar.
>
> Cassandra is started using supervisord and user as root, would you still
> recommend I try using Cassandra user.
>
>  ./nodetool describecluster
> Cluster Information:
> Name: Test Cluster
> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> Schema versions:
> cd361577-6947-3390-a787-be28fe499787: [Ip1]
>
> 9f5b5675-c9e7-3ae3-8ad6-6654fa4fb3e7: [Ip2]
>
> Interestingly
> ./nodetool cfstats shows all the tables
>
> Thanks,
> Kedar Parikh
>
> On Thursday 11 February 2016 07:34 PM, Sebastian Estevez wrote:
>
> Keep this on the user list; it's not appropriate for the dev list.
>
> 1) I noticed that some of your files are owned by root and others by
> Cassandra. If this is a package install you should always start C* as a
> service and chown your files and directories so they are owned by the
> Cassandra user, not root.  Never start Cassandra directly as root.
>
> 2) Once you have fixed your file ownerships, restart Cassandra on each
> node one at a time. You should see your sstables and commitlog get picked
> up by Cassandra in the system.log on startup. Share the output of
> 'nodetool describecluster' before and after.
>
> all the best,
>
> Sebastián
> On Feb 11, 2016 6:30 AM, "kedar"  wrote:
>
>> Thanks,
>>
>> kindly refer the following:
>>
>> https://gist.github.com/anonymous/3dddbe728a52c07d7c52
>> https://gist.github.com/anonymous/302ade0875dd6410087b
>>
>> Thanks,
>> Kedar Parikh
>>
>> <http://www.netcore.co.in>
>>
>> On Thursday 11 February 2016 04:35 PM, Romain Hardouin wrote:
>>
>>> Would you mind pasting the output for both nodes in gist/paste/whatever?
>>> https://gist.github.com http://paste.debian.net
>>>
>>>
>>>
>>> Le Jeudi 11 février 2016 11h57, kedar  a
>>> écrit :
>>> Thanks for the reply.
>>>
>>> ls -l cassandra/data/* lists various *.db files
>>>
>>> This problem is on both nodes.
>>>
>>> Thanks,
>>> Kedar Parikh
>>>
>>> <http://www.netcore.co.in>
>>>
>>>
>>>
>>>
>>
>>
>>
>


Re: Keyspaces not found in cqlsh

2016-02-11 Thread Sebastian Estevez
Keep this on the user list; it's not appropriate for the dev list.

1) I noticed that some of your files are owned by root and others by
Cassandra. If this is a package install you should always start C* as a
service and chown your files and directories so they are owned by the
Cassandra user, not root.  Never start Cassandra directly as root.

2) Once you have fixed your file ownerships, restart Cassandra on each node
one at a time. You should see your sstables and commitlog get picked up by
Cassandra in the system.log on startup. Share the output of 'nodetool
describecluster' before and after.

all the best,

Sebastián
On Feb 11, 2016 6:30 AM, "kedar"  wrote:

> Thanks,
>
> kindly refer the following:
>
> https://gist.github.com/anonymous/3dddbe728a52c07d7c52
> https://gist.github.com/anonymous/302ade0875dd6410087b
>
> Thanks,
> Kedar Parikh
>
> Ext : 2224
> Dir : +91 22 61782224
> Mob : +91 9819634734
> Email : kedar.par...@netcore.co.in
> Web : www.netcore.co.in
>
> On Thursday 11 February 2016 04:35 PM, Romain Hardouin wrote:
>
>> Would you mind pasting the output for both nodes in gist/paste/whatever?
>> https://gist.github.com http://paste.debian.net
>>
>>
>>
>> Le Jeudi 11 février 2016 11h57, kedar  a
>> écrit :
>> Thanks for the reply.
>>
>> ls -l cassandra/data/* lists various *.db files
>>
>> This problem is on both nodes.
>>
>> Thanks,
>> Kedar Parikh
>>
>> Ext : 2224
>> Dir : +91 22 61782224
>> Mob : +91 9819634734
>> Email : kedar.par...@netcore.co.in
>> Web : www.netcore.co.in
>>
>>
>>
>>
>
>
>


Re: EC2 storage options for C*

2016-02-03 Thread Sebastian Estevez
By the way, if someone wants to do some hard core testing like Al, I wrote
a guide on how to use his tool:

http://www.sestevez.com/how-to-use-toberts-effio/

I'm sure folks on this list would like to see more stats : )

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Feb 3, 2016 at 7:27 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> Good points Bryan, some more color:
>
> Regular EBS is *not* okay for C*. But AWS has some nicer EBS now that has
> performed okay recently:
>
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
>
> https://www.youtube.com/watch?v=1R-mgOcOSd4
>
>
> The cloud vendors are moving toward shared storage and we can't ignore
> that in the long term (they will push us in that direction financially).
> Fortunately their shared storage offerings are also getting better. For
> example Google's elastic storage offering provides very reliable
> latencies <https://www.youtube.com/watch?v=qf-7IhCqCcI>, which is what we
> care most about, not IOPS.
>
> On the practical side, a key thing I've noticed with real deployments is
> that the size of the volume affects how fast it will perform and how stable
> its latencies will be, so make sure to get large EBS volumes (> 1 TB) to get
> decent performance, even if your nodes aren't that dense.
>
>
>
>
> All the best,
>
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Wed, Feb 3, 2016 at 7:23 PM, Bryan Cheng  wrote:
>
>> From my experience, EBS has transitioned from "stay the hell away" to
>> "OK" as the new GP2 SSD type has come out and stabilized over the last few
>> years, especially with the addition of EBS-optimized instances that have
>> dedicated EBS bandwidth. The latter has really helped to stabilize the
>> problematic 99.9-percentile latency spikes that used to plague EBS volumes.
>>
>> EBS (IMHO) has always had operational advantages, but inconsistent
>> latency and generally poor performance in the past led many to disregard
>> it.
>>
>> On Wed, Feb 3, 2016 at 4:09 PM, James Rothering 
>> wrote:
>>
>>> Just curious here ... when did EBS become OK for C*? Didn't they always
>>> push towards using ephemeral disks?
>>>
>>> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead 
>>> wrote:
>>>
>>>> For what it's worth we've tried d2 instances and they encourage
>>>> terrible things like super dense nodes (increases your replacement time).
>>>> In terms of useable storage I would go with gp2 EBS on a m4 based instance.
>>>>
>>>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky 
>>>> wrote:
>>>>
>>>>> Ah, yes, the good old days of m1.large.
>>>>>

Re: EC2 storage options for C*

2016-02-03 Thread Sebastian Estevez
Good points Bryan, some more color:

Regular EBS is *not* okay for C*. But AWS has some nicer EBS now that has
performed okay recently:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html

https://www.youtube.com/watch?v=1R-mgOcOSd4


The cloud vendors are moving toward shared storage and we can't ignore that
in the long term (they will push us in that direction financially).
Fortunately their shared storage offerings are also getting better. For
example Google's elastic storage offering provides very reliable latencies,
which is what we care most about, not IOPS.

On the practical side, a key thing I've noticed with real deployments is
that the size of the volume affects how fast it will perform and how stable
its latencies will be, so make sure to get large EBS volumes (> 1 TB) to get
decent performance, even if your nodes aren't that dense.
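
As a rough sketch of why size matters here (assuming AWS's published gp2
model of the time; these numbers are not from this thread): gp2 baseline
throughput scales linearly with volume size,

    \text{baseline IOPS} = \min\bigl(10000,\ \max(100,\ 3 \times \text{size}_{\mathrm{GiB}})\bigr)

so a 1,000 GiB volume sustains about 3,000 IOPS on its own, while a 100 GiB
volume sustains only 300 and has to lean on finite burst credits beyond that.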




All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Feb 3, 2016 at 7:23 PM, Bryan Cheng  wrote:

> From my experience, EBS has transitioned from "stay the hell away" to "OK"
> as the new GP2 SSD type has come out and stabilized over the last few
> years, especially with the addition of EBS-optimized instances that have
> dedicated EBS bandwidth. The latter has really helped to stabilize the
> problematic 99.9-percentile latency spikes that used to plague EBS volumes.
>
> EBS (IMHO) has always had operational advantages, but inconsistent latency
> and generally poor performance in the past led many to disregard it.
>
> On Wed, Feb 3, 2016 at 4:09 PM, James Rothering 
> wrote:
>
>> Just curious here ... when did EBS become OK for C*? Didn't they always
>> push towards using ephemeral disks?
>>
>> On Wed, Feb 3, 2016 at 12:17 PM, Ben Bromhead 
>> wrote:
>>
>>> For what it's worth we've tried d2 instances and they encourage terrible
>>> things like super dense nodes (increases your replacement time). In terms
>>> of useable storage I would go with gp2 EBS on a m4 based instance.
>>>
>>> On Mon, 1 Feb 2016 at 14:25 Jack Krupansky 
>>> wrote:
>>>
 Ah, yes, the good old days of m1.large.

 -- Jack Krupansky

 On Mon, Feb 1, 2016 at 5:12 PM, Jeff Jirsa 
 wrote:

> A lot of people use the old gen instances (m1 in particular) because
> they came with a ton of effectively free ephemeral storage (up to 1.6TB).
> Whether or not they’re viable is a decision for each user to make. They’re
> very, very commonly used for C*, though. At a time when EBS was not
> sufficiently robust or reliable, a cluster of m1 instances was the de 
> facto
> standard.
>
> The canonical “best practice” in 2015 was i2. We believe we’ve made a
> compelling argument to use m4 or c4 instead of i2. There exists a company
> we know currently testing d2 at scale, though I’m not sure they have much
> in terms of concrete results at this time.
>
> - Jeff
>
> From: Jack Krupansky
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, February 1, 2016 at 1:55 PM
>
> To: "user@cassandra.apache.org"
> Subject: Re: EC2 storage options for C*
>
> Thanks. My typo - I referenced "C2 Dense Storage" which is really "D2
> Dense Storage".
>
> The remaining question is whether any of the "Previous Generation
> Instances" should be publicly recommended going forward.
>
> And whether non-SSD instances should be recommended going forward as
> well. sure, technically, someone could use the legacy instances, but the
> question is what we should be recommending as best practice going forward.
>
> Yeah, the i2 instances look like the sweet spot for any non-EBS
> clusters.
>
> -- Jack Krupansky
>
> On Mon, Feb 1, 2016 at 4:30 PM, Steve Robenalt  > wrote:
>
>> Hi Jack,
>>
>> At the bottom of the instance-types page, there is a link to the
>> previous generations, which includes the older series (m1, m2, etc), many
>> of which have HDD options.
>>
>> There are also the d2 

Re: Java Driver Question

2016-02-02 Thread Sebastian Estevez
Yes, topology changes get pushed to the client via the control connection:

https://github.com/datastax/java-driver/blob/2.1/driver-core/src/main/java/com/datastax/driver/core/Cluster.java#L61
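
As a minimal sketch of how an application can observe this (assuming the
DataStax Java driver 2.1.x linked above; the contact point below is
hypothetical), register a Host.StateListener and the driver will call it as
nodes join or leave, with no restart needed:

    // Sketch against the Java driver 2.1.x linked above; contact point is
    // hypothetical. Topology events arrive over the control connection and
    // are forwarded to registered listeners.
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Host;

    public class TopologyWatcher implements Host.StateListener {
        public void onAdd(Host host)       { System.out.println("added: " + host); }
        public void onUp(Host host)        { System.out.println("up: " + host); }
        public void onDown(Host host)      { System.out.println("down: " + host); }
        public void onRemove(Host host)    { System.out.println("removed: " + host); }
        public void onSuspected(Host host) { } // exists in 2.1.x, removed later

        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1") // one reachable node is enough
                    .build();
            cluster.register(new TopologyWatcher());
            cluster.init(); // opens the control connection; discovery starts here
        }
    }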

all the best,

Sebastián
On Feb 2, 2016 10:47 AM, "Richard L. Burton III"  wrote:

> In the case of adding more nodes to the cluster, would my application have
> to be restarted to detect the new nodes (as opposed to a node acting like a
> coordinator).
>
> e.g., Having the Java code connect using 3 known contact points and when a
> 4th and 5th node are added, the driver will become aware of these nodes
> without havng to be restarted?
>
> --
> -Richard L. Burton III
> @rburton
>


Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-02-02 Thread Sebastian Estevez
Hi Ken,

Earlier in this thread I posted a link to
https://issues.apache.org/jira/browse/CASSANDRA-9424

That is the fix for these schema disagreement issues and, as commented there,
the plan is to use CAS. Until then we have to treat schema delicately.
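
To make the CAS idea concrete, here is a minimal sketch (all names, including
the ops.created_tables control table, are invented for illustration; Java
driver 2.1.x):

    // Sketch: guard DDL with a lightweight transaction. Paxos guarantees at
    // most one client sees [applied] = true for a given name, so only the
    // race winner issues the CREATE TABLE.
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;

    public class GuardedTableCreator {
        public static void createDailyTable(Session session, String table) {
            ResultSet rs = session.execute(
                    "INSERT INTO ops.created_tables (table_name) VALUES (?) IF NOT EXISTS",
                    table);
            if (rs.one().getBool("[applied]")) { // true only for the winner
                session.execute("CREATE TABLE " + table
                        + " (id bigint PRIMARY KEY, body text)");
            }
        }
    }

Clients that lose the race should still wait for schema agreement before
using the table (see the schema-agreement sketch later in this archive).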

all the best,

Sebastián
On Feb 2, 2016 9:48 AM, "Ken Hancock"  wrote:

> So this rings odd to me.  If you can accomplish the same thing by using a
> CAS operation, why not fix CREATE TABLE IF NOT EXISTS so that if you are
> writing an application that creates the table on startup, the
> application is safe to run on multiple nodes and uses CAS to safeguard
> against multiple concurrent creations?
>
>
> On Tue, Jan 26, 2016 at 12:32 PM, Eric Stevens  wrote:
>
>> There's still a race condition there, because two clients could SELECT at
>> the same time as each other, then both INSERT.
>>
>> You'd be better served with a CAS operation, and let Paxos guarantee
>> at-most-once execution.
>>
>> On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes 
>> wrote:
>>
>>> On 01/22/2016 10:29 PM, Kevin Burton wrote:
>>>
>>> I sort of agree.. but we are also considering migrating to hourly
>>> tables.. and what if the single script doesn't run.
>>>
>>> I like having N nodes make changes like this because in my experience
>>> that central / single box will usually fail at the wrong time :-/
>>>
>>>
>>>
>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad 
>>> wrote:
>>>
 Instead of using ZK, why not solve your concurrency problem by removing
 it?  By that, I mean simply have 1 process that creates all your tables
 instead of creating a race condition intentionally?

 On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton 
 wrote:

> Not sure if this is a bug or not or kind of a *fuzzy* area.
>
> In 2.0 this worked fine.
>
> We have a bunch of automated scripts that go through and create
> tables... one per day.
>
> at midnight UTC our entire CQL went offline... took down our whole
> app. ;-/
>
> The resolution was a full CQL shut down and then a drop table to
> remove the bad tables...
>
> pretty sure the issue was with schema disagreement.
>
> All our CREATE TABLE statements use IF NOT EXISTS, but I think the IF NOT
> EXISTS only checks locally?
>
> My work around is going to be to use zookeeper to create a mutex lock
> during this operation.
>
> Any other things I should avoid?
>
>
> --
> We’re hiring if you know of any awesome Java Devops or Linux
> Operations Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog:  
> http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>
>>>
>>>
>>> --
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog:  http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> 
>>>
>>>
>>> One way to accomplish both, a single process doing the work and having
>>> multiple machines be able to do it, is to have a control table.
>>>
>>> You can have a table that lists what tables have been created and force
>>> concistency all. In this table you list the names of tables created. If a
>>> table name is in there, it doesn't need to be created again.
>>>
>>
>
>
> --
> *Ken Hancock *| System Architect, Advanced Advertising
> SeaChange International
> 50 Nagog Park
> Acton, Massachusetts 01720
> ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
>


Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-25 Thread Sebastian Estevez
You have to wait for schema agreement which most drivers should do by
default. At least have a check schema agreement method you can use.

https://datastax.github.io/java-driver/2.1.9/features/metadata/

The new cqlsh uses the python driver so the same should apply:

https://datastax.github.io/python-driver/api/cassandra/cluster.html

But check 'nodetool describecluster' to confirm that all nodes have the
same schema version.

Note: This will not help you in the concurrency / multiple writers
scenario.
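
A hedged sketch of making that wait explicit with the Java driver
(checkSchemaAgreement() is the call documented on the metadata page linked
above; the polling loop and timings are illustrative only):

    // Sketch: block until all nodes report one schema version, the
    // programmatic equivalent of eyeballing `nodetool describecluster`.
    import com.datastax.driver.core.Cluster;

    public final class SchemaWait {
        public static boolean awaitAgreement(Cluster cluster, long timeoutMs)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (System.currentTimeMillis() < deadline) {
                if (cluster.getMetadata().checkSchemaAgreement()) {
                    return true; // single schema version across the cluster
                }
                Thread.sleep(200);
            }
            return false; // still diverged; a down node can keep it this way
        }
    }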

all the best,

Sebastián
On Jan 23, 2016 7:29 PM, "Kevin Burton"  wrote:

> Once the CREATE TABLE returns in cqlsh (or programatically) is it safe to
> assume it's on all nodes at that point?
>
> If not I'll have to put in even more logic to handle this case..
>
> On Fri, Jan 22, 2016 at 9:22 PM, Jack Krupansky 
> wrote:
>
>> I recall that there was some discussion last year about this issue of how
>> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
>> unpredictable amount of time it takes for the table creation to fully
>> propagate around the full cluster. I think it was recognized as a real
>> problem, but without an immediate solution, so the recommended practice for
>> now is to only manually perform the operation (sure, it can be scripted,
>> but only under manual control) to assure that the operation completes and
>> that only one attempt is made to create the table. I don't recall if there
>> was a specific Jira assigned, and the antipattern doc doesn't appear to
>> reference this scenario. Maybe a committer can shed some more light.
>>
>> -- Jack Krupansky
>>
>> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton 
>> wrote:
>>
>>> I sort of agree.. but we are also considering migrating to hourly
>>> tables.. and what if the single script doesn't run.
>>>
>>> I like having N nodes make changes like this because in my experience
>>> that central / single box will usually fail at the wrong time :-/
>>>
>>>
>>>
>>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad 
>>> wrote:
>>>
 Instead of using ZK, why not solve your concurrency problem by removing
 it?  By that, I mean simply have 1 process that creates all your tables
 instead of creating a race condition intentionally?

 On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton 
 wrote:

> Not sure if this is a bug or not or kind of a *fuzzy* area.
>
> In 2.0 this worked fine.
>
> We have a bunch of automated scripts that go through and create
> tables... one per day.
>
> at midnight UTC our entire CQL went offline... took down our whole
> app. ;-/
>
> The resolution was a full CQL shut down and then a drop table to
> remove the bad tables...
>
> pretty sure the issue was with schema disagreement.
>
> All our CREATE TABLE statements use IF NOT EXISTS, but I think the IF NOT
> EXISTS only checks locally?
>
> My work around is going to be to use zookeeper to create a mutex lock
> during this operation.
>
> Any other things I should avoid?
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux
> Operations Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>
>>>
>>>
>>> --
>>>
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> 
>>>
>>>
>>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-23 Thread Sebastian Estevez
CASSANDRA-9424 

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Sat, Jan 23, 2016 at 12:22 AM, Jack Krupansky 
wrote:

> I recall that there was some discussion last year about this issue of how
> risky it is to do an automated CREATE TABLE IF NOT EXISTS due to the
> unpredictable amount of time it takes for the table creation to fully
> propagate around the full cluster. I think it was recognized as a real
> problem, but without an immediate solution, so the recommended practice for
> now is to only manually perform the operation (sure, it can be scripted,
> but only under manual control) to assure that the operation completes and
> that only one attempt is made to create the table. I don't recall if there
> was a specific Jira assigned, and the antipattern doc doesn't appear to
> reference this scenario. Maybe a committer can shed some more light.
>
> -- Jack Krupansky
>
> On Fri, Jan 22, 2016 at 10:29 PM, Kevin Burton  wrote:
>
>> I sort of agree.. but we are also considering migrating to hourly
>> tables.. and what if the single script doesn't run.
>>
>> I like having N nodes make changes like this because in my experience
>> that central / single box will usually fail at the wrong time :-/
>>
>>
>>
>> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad 
>> wrote:
>>
>>> Instead of using ZK, why not solve your concurrency problem by removing
>>> it?  By that, I mean simply have 1 process that creates all your tables
>>> instead of creating a race condition intentionally?
>>>
>>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton  wrote:
>>>
 Not sure if this is a bug or not or kind of a *fuzzy* area.

 In 2.0 this worked fine.

 We have a bunch of automated scripts that go through and create
 tables... one per day.

 at midnight UTC our entire CQL went offline... took down our whole app. ;-/

 The resolution was a full CQL shut down and then a drop table to remove
 the bad tables...

 pretty sure the issue was with schema disagreement.

 All our CREATE TABLE statements use IF NOT EXISTS, but I think the IF NOT
 EXISTS only checks locally?

 My work around is going to be to use zookeeper to create a mutex lock
 during this operation.

 Any other things I should avoid?


 --

 We’re hiring if you know of any awesome Java Devops or Linux Operations
 Engineers!

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 


>>
>>
>> --
>>
>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>> Engineers!
>>
>> Founder/CEO Spinn3r.com
>> Location: *San Francisco, CA*
>> blog: http://burtonator.wordpress.com
>> … or check out my Google+ profile
>> 
>>
>>
>


Re: Getting error while issuing Cassandra stress

2016-01-22 Thread Sebastian Estevez
https://github.com/brianmhess/cassandra-loader

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 22, 2016 at 4:37 PM, Bhuvan Rawal  wrote:

> Yes, I'm specifying the -node parameter to stress; otherwise it throws a
> network connection failure.
>
> Can you point me to a sample java application to test pushing data from
> external server? Let's see if that works
>
> On Sat, Jan 23, 2016 at 2:55 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> when I opened my cassandra-rackdc.properties I saw that the DC names were DC1
>>> & DC2, and the rack name was RAC1. Please note that this is the default
>>> configuration; I have not modified any file.
>>
>>
>> cassandra-rackdc.properties is only respected based on your snitch
>> <https://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureSnitchesAbout_c.html>
>> .
>>
>> $ cqlsh
>>> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
>>> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
>>> Connection refused")})
>>> whereas
>>> $ cqlsh 
>>> works fine
>>> is that the reason why the cassandra-stress is not able to communicate
>>> with other replicas?
>>
>>
>> Are you providing the -node parameter to stress
>> <http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html>
>> ?
>>
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Fri, Jan 22, 2016 at 4:07 PM, Bhuvan Rawal 
>> wrote:
>>
>>> I had a look at the jira below:
>>> https://issues.apache.org/jira/browse/CASSANDRA-7905
>>>
>>> when I opened my cassandra-rackdc.properties I saw that the DC names were
>>> DC1 & DC2, and the rack name was RAC1. Please note that this is the default
>>> configuration; I have not modified any file.
>>>
>>> There is another point of concern here which might be relevant to the
>>> previous one as well: I'm not able to log in to cqlsh directly, i.e. I have
>>> to specify the IP even when I'm logged in to that machine.
>>>
>>> $ cqlsh
>>> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
>>> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
>>> Connection refused")})
>>>
>>> whereas
>>> $ cqlsh 
>>> works fine
>>>
>>> is that the reason why the cassandra-stress is not able to communicate
>>> with other replicas?
>>>
>>> On Sat, Jan 23, 2016 at 1:37 AM, Sebastian 

Re: Getting error while issuing Cassandra stress

2016-01-22 Thread Sebastian Estevez
>
> when I opened my cassandra-rackdc.properties I saw that the DC names were DC1
> & DC2, and the rack name was RAC1. Please note that this is the default
> configuration; I have not modified any file.


cassandra-rackdc.properties is only respected based on your snitch
<https://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureSnitchesAbout_c.html>
.

$ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
> Connection refused")})
> whereas
> $ cqlsh 
> works fine
> is that the reason why the cassandra-stress is not able to communicate
> with other replicas?


Are you providing the -node parameter to stress
<http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCStress_t.html>
?



All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 22, 2016 at 4:07 PM, Bhuvan Rawal  wrote:

> I had a look at the jira below:
> https://issues.apache.org/jira/browse/CASSANDRA-7905
>
> when I opened my cassandra-rackdc.properties I saw that the DC names were DC1
> & DC2, and the rack name was RAC1. Please note that this is the default
> configuration; I have not modified any file.
>
> There is another point of concern here which might be relevant to the
> previous one as well: I'm not able to log in to cqlsh directly, i.e. I have
> to specify the IP even when I'm logged in to that machine.
>
> $ cqlsh
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
> Connection refused")})
>
> whereas
> $ cqlsh 
> works fine
>
> is that the reason why the cassandra-stress is not able to communicate
> with other replicas?
>
> On Sat, Jan 23, 2016 at 1:37 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Sorry I missed that.
>>
>> Both your nodetool status and keyspace replication settings say Cassandra
>> and Analytics for the DC names. I'm not sure where you're seeing DC1, DC2,
>> etc. and why you suspect that is the problem.
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Fri, Jan 22, 2016 at 1:45 PM, Bhuvan Rawal 
>> wrote:
>>
>>> Hi Sebastian,
>>>
>>> I had attached nodetool status output in previous mail, pasting it again
>>> :
>>>
>>> $ nodetool status
>>> Datacenter: Analytics
>>> =====================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address      Load       Tokens  Owns  Host ID                               Rack
>>> UN  10.41.55.17  428.5 KB   256     ?     39d6d585-e641-4046-9d0b-797356597b5e  rack1
>>> UN  10.41.55.19  404.44 KB  256     ?     69edf930-efd9-4d74-a798-f3d4ac02e516  rack1

Re: Getting error while issuing Cassandra stress

2016-01-22 Thread Sebastian Estevez
Sorry I missed that.

Both your nodetool status and keyspace replication settings say Cassandra
and Analytics for the DC names. I'm not sure where you're seeing DC1, DC2,
etc. and why you suspect that is the problem.
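
One common way this manifests is through the client's load balancing policy:
with LOCAL_ONE the driver only counts replicas in whatever it considers the
local DC, so a local DC name that doesn't match the keyspace's
NetworkTopologyStrategy entries leaves 0 live replicas. A hedged Java driver
2.1.x sketch of pinning the DC explicitly (the contact point is taken from
the nodetool status output quoted below; everything else is illustrative):

    // Sketch: the DC name passed here must match what appears in
    // `nodetool status` and in the keyspace replication map.
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.QueryOptions;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

    public class LocalDcClient {
        public static Cluster build() {
            return Cluster.builder()
                    .addContactPoint("10.41.55.15") // a node in the "Cassandra" DC
                    .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("Cassandra"))
                    .withQueryOptions(new QueryOptions()
                            .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
                    .build();
        }
    }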

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 22, 2016 at 1:45 PM, Bhuvan Rawal  wrote:

> Hi Sebastian,
>
> I had attached nodetool status output in previous mail, pasting it again :
>
> $ nodetool status
> Datacenter: Analytics
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns  Host ID                               Rack
> UN  10.41.55.17  428.5 KB   256     ?     39d6d585-e641-4046-9d0b-797356597b5e  rack1
> UN  10.41.55.19  404.44 KB  256     ?     69edf930-efd9-4d74-a798-f3d4ac02e516  rack1
> UN  10.41.55.18  423.21 KB  256     ?     b74bab13-09b2-4760-bce9-c8ef05e50f6d  rack1
> UN  10.41.55.20  683.23 KB  256     ?     fb5c4fed-6e1e-4ea8-838d-358106906830  rack1
>
> Datacenter: Cassandra
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns  Host ID                               Rack
> UN  10.41.55.15  209.4 KB   256     ?     ffc3b9a0-5d5c-4a3d-a99e-49d255731278  rack1
> UN  10.41.55.21  227.44 KB  256     ?     c68deba4-b9a2-43fc-bb13-6af74c88c210  rack1
> UN  10.41.55.23  222.71 KB  256     ?     8229aa87-af00-48fa-ad6b-3066d3dc0e58  rack1
> UN  10.41.55.22  218.72 KB  256     ?     c7ba84fd-7992-41de-8c88-11574a72db99  rack1
>
> Regards,
> Bhuvan Rawal
>
> On Sat, Jan 23, 2016 at 12:11 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> The output of `nodetool status` would help us diagnose.
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Fri, Jan 22, 2016 at 1:39 PM, Bhuvan Rawal 
>> wrote:
>>
>>> Thanks for the response Alain,
>>>
>>> cqlsh> create keyspace mykeyspace WITH replication =
>>> {'class':'NetworkTopologyStrategy', 'Analytics':2, 'Cassandra':3}
>>> cqlsh> use mykeyspace;
>>> cqlsh:mykeyspace>create table mytable (id int primary key, name text,
>>> address text, phone text);
>>> cqlsh:mykeyspace> insert into mytable (id, name, address, phone) values
>>> (1, 'Kiyu','Texas', '555-1212'); # and other similar statement
>>> I then issued the below command from every node and found consistent
>>> results.
>>> cqlsh:mykeyspace> select * from mytable;
>>>
>>> // Then i repeated the above steps for NetworkTopologyStrategy and found
>>> same results
>>>
>>> I ran basic cassandra stress
>>> seed1 - seed of datacenter 1
>>>  $ cassandra-stress write n=5 -rate threads=4 -node any_random_ip
>>>  $ cassandra-stress write n=5 -rate threads=4 

Re: Getting error while issuing Cassandra stress

2016-01-22 Thread Sebastian Estevez
The output of `nodetool status` would help us diagnose.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 22, 2016 at 1:39 PM, Bhuvan Rawal  wrote:

> Thanks for the response Alain,
>
> cqlsh> create keyspace mykeyspace WITH replication =
> {'class':'NetworkTopologyStrategy', 'Analytics':2, 'Cassandra':3}
> cqlsh> use mykeyspace;
> cqlsh:mykeyspace>create table mytable (id int primary key, name text,
> address text, phone text);
> cqlsh:mykeyspace> insert into mytable (id, name, address, phone) values
> (1, 'Kiyu','Texas', '555-1212'); # and other similar statement
> I then issued the below command from every node and found consistent
> results.
> cqlsh:mykeyspace> select * from mytable;
>
> // Then i repeated the above steps for NetworkTopologyStrategy and found
> same results
>
> I ran basic cassandra stress
> seed1 - seed of datacenter 1
>  $ cassandra-stress write n=5 -rate threads=4 -node any_random_ip
>  $ cassandra-stress write n=5 -rate threads=4 -node seed1
>  $ cassandra-stress write n=5 -rate threads=4 -node seed1,seed2
>  $ cassandra-stress write n=5 -rate threads=4 -node
> all_8_ip_comma_seperated
>  $ cassandra-stress write n=100 cl=one -mode native cql3 -schema
> keyspace="keyspace1" -pop seq=1..100 -node ip1,ip2,ip3,ip4
>
> All of them threw the exception
> *com.datastax.driver.core.exceptions.UnavailableException: Not enough
> replica available for query at consistency LOCAL_ONE (1 required but only 0
> alive)*
>
>
> I have a feeling that the issue is with the datacenter names for some reason,
> because in some config files I found DC names like DC1/DC2/DC3 and in some
> they are like Cassandra/Analytics (the ones I had specified during
> installation). I'm unsure which yaml/property file to check to correct the
> inconsistency.
>
> (C*heers :) - im so tempted to copy that)
>
> Regards,
> Bhuvan
>
> On Fri, Jan 22, 2016 at 8:47 PM, Alain RODRIGUEZ 
> wrote:
>
>> Hi,
>>
>> The exact command you ran (stress-tool with options) could be useful
>> to help you on that.
>>
>> However, I'm able to create keyspaces, tables and insert data using cqlsh
>>> and it is replicating fine to all the nodes.
>>
>>
>> Having the schema might be useful too.
>>
>> Did you run cqlsh and the stress-tool from the same server? If not,
>> you might want to check that the ports you use (9042/9160/...) are open.
>> Also, cqlsh uses LOCAL_ONE by default too. If both commands were run
>> against the same DC, from the same machine, they should behave the same way.
>> Are they?
>>
>> C*heers,
>>
>> -
>> Alain
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>>
>> 2016-01-22 9:57 GMT+01:00 Bhuvan Rawal :
>>
>>> Hi,
>>>
>>> I have created a POC cluster with 2 DCs, each having 4 nodes with DSE
>>> 4.8.1 installed.
>>>
>>> On issuing cassandra-stress I'm getting an error and data is not being
>>> inserted:
>>> *com.datastax.driver.core.exceptions.UnavailableException: Not enough
>>> replica available for query at consistency LOCAL_ONE (1 required but only 0
>>> alive)*
>>>
>>> However, I'm able to create keyspaces, tables and insert data using cqlsh
>>> and it is replicating fine to all the nodes.
>>>
>>> Details of the cluster can be found below (all the nodes seem to be
>>> alive and kicking):
>>>
>>> $ nodetool status
>>> Datacenter: Analytics
>>> =====================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address      Load       Tokens  Owns  Host ID                               Rack
>>> UN  10.41.55.17  428.5 KB   256     ?     39d6d585-e641-4046-9d0b-797356597b5e  rack1
>>> UN  10.41.55.19  404.44 KB  256     ?     69edf930-efd9-4d74-a798-f3d4ac02e516  rack1
>>> UN  10.41.55.18  423.21 KB  256     ?     b74bab13-09b2-4760-bce9-c8ef05e50f6d  rack1
>>> UN  10.41.55.20  683.23 KB  256     ?     fb5c4fed-6e1e-4ea8-838d-358106906830  rack1
>>>
>>> Datacenter: Cassandra
>>> =====================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address      Load       Tokens  Owns  Host ID                               Rack
>>> UN  10.41.55.15  209.4 KB   256     ?     ffc3b9a0-5d5c-4a3d-a99e-49d255731278  rack1
>>> UN  10.41.55.21  227.44 KB  256     ?     c68deba4-b9a2-43fc-bb13-6af74c88c210  rack1
>>> UN  10.41.55.23  222.71 KB  256     ?     8229aa87-af00-48fa-ad6b-3066d3dc0e58  rack1

Re: compaction throughput

2016-01-21 Thread Sebastian Estevez
@penguin There have been steady improvements in the different compaction
strategies recently but not major re-writes.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Jan 21, 2016 at 9:12 AM, Kai Wang  wrote:

> I am using 2.2.4 and have seen multiple compactors running on the same
> table. The number of compactors seems to be controlled by
> concurrent_compactors. As for the types of compactions, I've seen normal
> compaction, tombstone compaction. Validation and Anticompaction seem to
> always be single threaded.
>
> On Thu, Jan 21, 2016 at 8:28 AM, PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com> wrote:
>
>> Thanks for that clarification Sebastian! That's really good to know! I
>> never took increasing this value into consideration because of my previous
>> experience.
>>
>> In my case I had a table that was compacting over and over... and only
>> one CPU was used. So that made me believe it was not multithreaded (I
>> actually believe I asked this on IRC, but it's been a few months, so
>> I might be wrong).
>>
>> Have there been behavioral changes on this lately? (I was using 2.0.9 or
>> 2.0.11 I believe).
>>
>> 2016-01-21 14:15 GMT+01:00 Sebastian Estevez <
>> sebastian.este...@datastax.com>:
>>
>>> >So compaction of one table will NOT spread over different cores.
>>>
>>> This is not exactly true. You actually can have multiple compactions
>>> running at the same time on the same table, it just doesn't happen all that
>>> often. You essentially would have to have two sets of sstables that are
>>> both eligible for compactions at the same time.
>>>
>>> all the best,
>>>
>>> Sebastián
>>> On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <
>>> th3penguinwhispe...@gmail.com> wrote:
>>>
>>>> After having some issues myself with compaction I think it's noteworthy
>>>> to explicitly state that compaction of a table can only run on one CPU. So
>>>> compaction of one table will NOT spread over different cores.
>>>> To really make use of concurrent_compactors you need to have multiple
>>>> table compactions initiated at the same time. If those are small they'll
>>>> finish way earlier resulting in only one core using 100% as compaction is
>>>> generally CPU bound (unless your disks can't keep up).
>>>> I believe it's better for compaction to be CPU bound on one core (or at
>>>> least not all cores) than disk IO bound, as the latter would hurt the
>>>> performance of writes and reads.
>>>> Compaction is a maintenance task so it shouldn't be eating all your
>>>> resources.
>>>>
>>>>
>>>>
>>>> 2016-01-16 0:18 GMT+01:00 Kai Wang :
>>>>
>>>>> Jeff & Sebastian,
>>>>>
>>>>> Thanks for the reply. There are 12 cores but in my case C* only uses
>>>>> one core most of the time. *nodetool compactionstats* shows there's
>>>>> only one compactor running. I can see C* process only uses one core. So I
>>>>> guess I should've asked the questi

Re: compaction throughput

2016-01-21 Thread Sebastian Estevez
>So compaction of one table will NOT spread over different cores.

This is not exactly true. You actually can have multiple compactions
running at the same time on the same table, it just doesn't happen all that
often. You essentially would have to have two sets of sstables that are
both eligible for compactions at the same time.

all the best,

Sebastián
On Jan 21, 2016 7:41 AM, "PenguinWhispererThe ." <
th3penguinwhispe...@gmail.com> wrote:

> After having some issues myself with compaction I think it's noteworthy to
> explicitly state that compaction of a table can only run on one CPU. So
> compaction of one table will NOT spread over different cores.
> To really make use of concurrent_compactors you need to have multiple
> table compactions initiated at the same time. If those are small they'll
> finish way earlier resulting in only one core using 100% as compaction is
> generally CPU bound (unless your disks can't keep up).
> I believe it's better for compaction to be CPU bound on one core (or at
> least not all cores) than disk IO bound, as the latter would hurt the
> performance of writes and reads.
> Compaction is a maintenance task so it shouldn't be eating all your
> resources.
>
>
>
> 2016-01-16 0:18 GMT+01:00 Kai Wang :
>
>> Jeff & Sebastian,
>>
>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>> core most of the time. *nodetool compactionstats* shows there's only one
>> compactor running. I can see C* process only uses one core. So I guess I
>> should've asked the question more clearly:
>>
>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>> 2. Is there any configuration that affects single core compaction
>> throughput?
>> 3. Is concurrent_compactors the only option to parallelize compaction? If
>> so, I guess it's the compaction strategy itself that decides when to
>> parallelize and when to block on one core. Then there's not much we can do
>> here.
>>
>> Thanks.
>>
>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
>> wrote:
>>
>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>> (depending on other load).  How many CPU cores do you have?
>>>
>>>
>>> From: Kai Wang
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Friday, January 15, 2016 at 12:53 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: compaction throughput
>>>
>>> Hi,
>>>
>>> I am trying to figure out the bottleneck of compaction on my node. The
>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>> LCS. Here is my compaction related configs in cassandra.yaml:
>>>
>>> compaction_throughput_mb_per_sec: 160
>>> concurrent_compactors: 4
>>>
>>> I insert about 10G of data and start observing compaction.
>>>
>>> *nodetool compaction* shows most of time there is one compaction.
>>> Sometimes there are 3-4 (I suppose this is controlled by
>>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>>> At that point, disk IO is about 20-25 M/s write which is much lower than
>>> the disk is capable of. Even when there are 4 compactions running, I see
>>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>>> setcompactionthroughput 0* to disable the compaction throttle but don't
>>> see any difference.
>>>
>>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>>> there anyway to improve the throughput?
>>>
>>> Thanks.
>>>
>>
>>
>


Re: Unable to locate Solr Configuration file ( Generated using dsetool )

2016-01-18 Thread Sebastian Estevez
You can post it to the server using either curl or dsetool:

http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/srch/srchReldCore.html

use the solrconfig and schema options:

Option       Settings  Default  Description
schema=      path      n/a      Path of the schema file used for reloading the core
solrconfig=  path      n/a      Path of the solrconfig file used for reloading the core

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Jan 18, 2016 at 12:29 PM, Harikrishnan A  wrote:

> Thanks Jack ..
> So how do I customize these resource files? I mean, if I want to add some
> custom fields or change the default text analyzer, etc.
>
> Sent from Yahoo Mail on Android
> 
>
> On Mon, Jan 18, 2016 at 7:50 AM, Jack Krupansky
>  wrote:
> Also, you can (and probably should) use the Solr admin UI console if you
> simply wish to view the generated resource files.
>
> -- Jack Krupansky
>
> On Mon, Jan 18, 2016 at 9:46 AM, Jack Krupansky 
> wrote:
>
>> As per the DSE Search doc: "Resource files are stored in Cassandra
>> database, not in the file system. The schema.xml and solrconfig.xml resources
>> are persisted in the solr_admin.solr_resources database table":
>>
>> http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/srch/srchUpld.html
>>
>> Use the dsetool get_core_schema and get_core_config commands to retrieve
>> the generated Solr schema and solrconfig files:
>>
>> http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/tools/toolsDsetool.html
>>
>> You can also use the dsetool read_resource command to read any of the
>> Solr resource "files".
>>
>>
>> -- Jack Krupansky
>>
>> On Mon, Jan 18, 2016 at 12:47 AM, Harikrishnan A 
>> wrote:
>>
>>> Hello,
>>>
>>> I have created a solr core with automatic resource generation using the
>>> below command
>>>
>>> > dsetool create_core . generateResources=true 
>>> > reindex=true
>>>
>>> However I am unable to locate the schema.xml and the solrconfig.xml which 
>>> got created for this core.
>>>
>>> What is the default location of these configuration files?
>>> Can I customize these configuration files once it is generated using the
>>> above commands?
>>>
>>> Thanks & Regards,
>>> Hari
>>>
>>
>>
>


Re: Too many compactions, maybe keyspace system?

2016-01-17 Thread Sebastian Estevez
I agree that this may be worth a jira.

Can you clarify this statement?

>> 5 keyspaces and about 100 cfs

How many total empty tables did you create? Creating hundreds of tables is
a bad practice in Cassandra but I was not aware of a compaction impact like
what you're describing.
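
For anyone reproducing this without heap dumps, the same backlog can be
watched live over JMX; a sketch (the standard Cassandra JMX port 7199 as in
the listings earlier in this archive; the MBean is the one nodetool
compactionstats reads, and attribute types vary across versions):

    // Sketch: read the compaction backlog over JMX instead of via jmap.
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class PendingCompactions {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
                ObjectName mgr = new ObjectName(
                        "org.apache.cassandra.db:type=CompactionManager");
                // Same figure `nodetool compactionstats` prints as pending.
                System.out.println("Pending compactions: "
                        + mbs.getAttribute(mgr, "PendingTasks"));
            } finally {
                jmxc.close();
            }
        }
    }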

all the best,

Sebastián
On Jan 16, 2016 4:43 AM, "DuyHai Doan"  wrote:

> Interesting, maybe it's worth filing a JIRA. Empty tables should not slow
> down compaction of other tables
>
> On Sat, Jan 16, 2016 at 10:33 AM, Shuo Chen  wrote:
>
>> Hi, Robert,
>>
>> I think I found the cause of the too many compactions. I used jmap to
>> dump the heap and used the Eclipse memory analyzer plugin to inspect it.
>>
>> As mentioned in the previous reply, it shows that there are too many
>> pending jobs in the blocking queue. I checked the cf of each compaction
>> task object: many of the tasks concern some empty cfs I created before.
>>
>> I created 5 keyspaces and about 100 cfs with cassandra-cli months ago and
>> did not put any data in them yet. In fact, there is only 1 keyspace I
>> created containing data; the other 5 keyspaces are empty.
>>
>> When I dropped these 5 keyspaces and restarted the high-compaction node,
>> it runs normally with a normal amount of compactions.
>>
>> So maybe there is a bug in compaction for empty column families?
>>
>> On Wed, Jan 13, 2016 at 2:39 AM, Robert Coli 
>> wrote:
>>
>>> On Mon, Jan 11, 2016 at 9:12 PM, Shuo Chen 
>>> wrote:
>>>
 I have an assumption that lots of pending compaction tasks jam the
 memory and trigger full GC. The full GC chokes the process and slows down
 compaction. And this causes more pending compaction tasks and more pressure
 on memory.

>>>
>>> The question is why there are so many pending compactions, because your
>>> log doesn't show that much compaction is happening. What keyspaces /
>>> columnfamilies do you expect to be compacting, and how many SSTables do
>>> they contain?
>>>
>>>
 Is there a method to list the concrete details of pending compaction
 tasks?

>>>
>>> Nope.
>>>
>>> For the record, this type of extended operational debugging is often
>>> best carried out interactively on #cassandra on freenode IRC.. :)
>>>
>>> =Rob
>>>
>>
>>
>>
>> --
>> *陈硕* *Shuo Chen*
>> chenatu2...@gmail.com
>> chens...@whaty.com
>>
>
>


Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
LCS is IO intensive but CPU is also relevant.

On slower disks compaction may not be CPU bound.

If you aren't seeing more than one compaction thread at a time, I suspect
your system is not compaction bound.

all the best,

Sebastián
On Jan 15, 2016 7:20 PM, "Kai Wang"  wrote:

> Sebastian,
>
> Because I have this impression that LCS is IO intensive and it's
> recommended only on SSDs. So I am curious to see how far it can stress
> those SSDs. But it turns out the most expensive part about LCS is not IO
> bound but CPU bound, or more precisely single-core speed bound. This is a
> little surprising.
>
> Of course LCS is still superior in other aspects.
> On Jan 15, 2016 6:34 PM, "Sebastian Estevez" <
> sebastian.este...@datastax.com> wrote:
>
>> Correct.
>>
>> Why are you concerned with the raw throughput? Are you accumulating
>> pending compactions? Are you seeing high sstables-per-read statistics?
>>
>> all the best,
>>
>> Sebastián
>> On Jan 15, 2016 6:18 PM, "Kai Wang"  wrote:
>>
>>> Jeff & Sebastian,
>>>
>>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>>> core most of the time. *nodetool compactionstats* shows there's only
>>> one compactor running. I can see C* process only uses one core. So I guess
>>> I should've asked the question more clearly:
>>>
>>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>>> 2. Is there any configuration that affects single core compaction
>>> throughput?
>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>> If so, I guess it's the compaction strategy itself that decides when to
>>> parallelize and when to block on one core. Then there's not much we can do
>>> here.
>>>
>>> Thanks.
>>>
>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
>>> wrote:
>>>
>>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>>> (depending on other load).  How many CPU cores do you have?
>>>>
>>>>
>>>> From: Kai Wang
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Friday, January 15, 2016 at 12:53 PM
>>>> To: "user@cassandra.apache.org"
>>>> Subject: compaction throughput
>>>>
>>>> Hi,
>>>>
>>>> I am trying to figure out the bottleneck of compaction on my node. The
>>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>>> LCS. Here is my compaction related configs in cassandra.yaml:
>>>>
>>>> compaction_throughput_mb_per_sec: 160
>>>> concurrent_compactors: 4
>>>>
>>>> I insert about 10G of data and start observing compaction.
>>>>
>>>> *nodetool compaction* shows most of time there is one compaction.
>>>> Sometimes there are 3-4 (I suppose this is controlled by
>>>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>>>> At that point, disk IO is about 20-25 M/s write which is much lower than
>>>> the disk is capable of. Even when there are 4 compactions running, I see
>>>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>>>> setcompactionthroughput 0* to disable the compaction throttle but
>>>> don't see any difference.
>>>>
>>>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>>>> there anyway to improve the throughput?
>>>>
>>>> Thanks.
>>>>
>>>
>>>


Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
Correct.

Why are you concerned with the raw throughput, are you accumulating pending
compactions? Are you seeing high sstables per read statistics?

all the best,

Sebastián
On Jan 15, 2016 6:18 PM, "Kai Wang"  wrote:

> Jeff & Sebastian,
>
> Thanks for the reply. There are 12 cores but in my case C* only uses one
> core most of the time. *nodetool compactionstats* shows there's only one
> compactor running. I can see C* process only uses one core. So I guess I
> should've asked the question more clearly:
>
> 1. Is ~25 M/s a reasonable compaction throughput for one core?
> 2. Is there any configuration that affects single core compaction
> throughput?
> 3. Is concurrent_compactors the only option to parallelize compaction? If
> so, I guess it's the compaction strategy itself that decides when to
> parallelize and when to block on one core. Then there's not much we can do
> here.
>
> Thanks.
>
> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
> wrote:
>
>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>> (depending on other load).  How many CPU cores do you have?
>>
>>
>> From: Kai Wang
>> Reply-To: "user@cassandra.apache.org"
>> Date: Friday, January 15, 2016 at 12:53 PM
>> To: "user@cassandra.apache.org"
>> Subject: compaction throughput
>>
>> Hi,
>>
>> I am trying to figure out the bottleneck of compaction on my node. The
>> node is CentOS 7 and has SSDs installed. The table is configured to use
>> LCS. Here is my compaction related configs in cassandra.yaml:
>>
>> compaction_throughput_mb_per_sec: 160
>> concurrent_compactors: 4
>>
>> I insert about 10G of data and start observing compaction.
>>
>> *nodetool compaction* shows most of time there is one compaction.
>> Sometimes there are 3-4 (I suppose this is controlled by
>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>> At that point, disk IO is about 20-25 M/s write which is much lower than
>> the disk is capable of. Even when there are 4 compactions running, I see
>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>> setcompactionthroughput 0* to disable the compaction throttle but don't
>> see any difference.
>>
>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>> there anyway to improve the throughput?
>>
>> Thanks.
>>
>
>


Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
 *nodetool setcompactionthroughput 0*

Will only affect future compactions, not the ones that are currently
running.
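
For example (a sketch; the throughput value is in MB/s and purely illustrative):

    nodetool setcompactionthroughput 0    # 0 removes the cap for compactions that start from now on
    nodetool compactionstats              # the ones already in flight finish at the old rate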

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 15, 2016 at 4:40 PM, Jeff Ferland  wrote:

> Compaction is generally CPU bound and relatively slow. Exactly why that is
> I’m uncertain.
>
> On Jan 15, 2016, at 12:53 PM, Kai Wang  wrote:
>
> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>
>
>


Re: Cassandra 3.1.1 with respect to HeapSpace

2016-01-15 Thread Sebastian Estevez
The recommended (and default when available) heap size for Cassandra is 8GB,
and for new gen size it's 100MB per core.

Your mileage may vary based on workload, hardware, etc.
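
On an 8-core box, for example, that works out to something like this in
cassandra-env.sh (a sketch; the values are illustrative, not a prescription):

    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"    # roughly 100MB per core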

There are also some alternative JVM tuning schools of thought. See
CASSANDRA-8150 (large heap) and CASSANDRA-7486 (G1GC).



All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 15, 2016 at 4:00 AM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:

> Thank you Sebastián for your useful advice. I managed to restart the
> nodes, but I needed to delete all the commit logs, not only the last one
> specified. Nevertheless I’m back in business.
>
> Would there be a better memory configuration to select for my nodes in a
> C* 3 cluster? Currently I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE="496M" for
> a 16GB RAM node.
>
> Thanks for your help.
>
> Jean
>
> On 15 Jan 2016, at 24:24 , Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>
> Try starting the other nodes. You may have to delete or mv the commitlog
> segment referenced in the error message for the node to come up since
> apparently it is corrupted.
>
> All the best,
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Thu, Jan 14, 2016 at 1:00 PM, Jean Tremblay <
> jean.tremb...@zen-innovations.com> wrote:
>
>> How can I restart?
>> It blocks with the error listed below.
>> Are my memory settings good for my configuration?
>>
>> On 14 Jan 2016, at 18:30, Jake Luciani  wrote:
>>
>> Yes you can restart without data loss.
>>
>> Can you please include info about how much data you have loaded per node
>> and perhaps what your schema looks like?
>>
>> Thanks
>>
>> On Thu, Jan 14, 2016 at 12:24 PM, Jean Tremblay <
>> jean.tremb...@zen-innovations.com> wrote:
>>
>>>
>>> Ok, I will open a ticket.
>>>
>>> How could I restart my cluster without losing everything?
>>> Would there be a better memory configuration to select for my nodes?
>>> Currently I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE="496M" for a 16GB RAM node.
>>>
>>> Thanks
>>>
>>> Jean
>>>
>>> On 14 Jan 2016, at 18:19, Tyler Hobbs  wrote:
>>>
>>> I don't think that's a known issue.  Can you open a ticket at
>>> https://issues.apache.org/jira/browse/CASSANDRA and attach your schema
>>> along with the commitlog files and the mutation that was saved to /tmp?
>>>
>>> On Thu, Jan 14, 2016 at 10:56 AM, Jean Tremblay <
>>> jean.tremb...@zen-innovations.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a small Cassandra cluster with 5 nodes, having 16GB of RAM.
>>>> I use Cassandra 3.1.1.
>>>> I use the following setup for the memory:
>>>>

Re: Cassandra 3.1.1 with respect to HeapSpace

2016-01-14 Thread Sebastian Estevez
Try starting the other nodes. You may have to delete or mv the commitlog
segment referenced in the error message for the node to come up since
apparently it is corrupted.

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Jan 14, 2016 at 1:00 PM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:

> How can I restart?
> It blocks with the error listed below.
> Are my memory settings good for my configuration?
>
> On 14 Jan 2016, at 18:30, Jake Luciani  wrote:
>
> Yes you can restart without data loss.
>
> Can you please include info about how much data you have loaded per node
> and perhaps what your schema looks like?
>
> Thanks
>
> On Thu, Jan 14, 2016 at 12:24 PM, Jean Tremblay <
> jean.tremb...@zen-innovations.com> wrote:
>
>>
>> Ok, I will open a ticket.
>>
>> How could I restart my cluster without losing everything?
>> Would there be a better memory configuration to select for my nodes?
>> Currently I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE="496M" for a 16GB RAM node.
>>
>> Thanks
>>
>> Jean
>>
>> On 14 Jan 2016, at 18:19, Tyler Hobbs  wrote:
>>
>> I don't think that's a known issue.  Can you open a ticket at
>> https://issues.apache.org/jira/browse/CASSANDRA and attach your schema
>> along with the commitlog files and the mutation that was saved to /tmp?
>>
>> On Thu, Jan 14, 2016 at 10:56 AM, Jean Tremblay <
>> jean.tremb...@zen-innovations.com> wrote:
>>
>>> Hi,
>>>
>>> I have a small Cassandra cluster with 5 nodes, having 16GB of RAM.
>>> I use Cassandra 3.1.1.
>>> I use the following setup for the memory:
>>>   MAX_HEAP_SIZE="6G"
>>>   HEAP_NEWSIZE="496M"
>>>
>>> I have been loading a lot of data into this cluster over the last 24
>>> hours. The system behaved, I think, very nicely. It was loading very fast
>>> and giving excellent read times. There were no error messages until this one:
>>>
>>>
>>> ERROR [SharedPool-Worker-35] 2016-01-14 17:05:23,602
>>> JVMStabilityInspector.java:139 - JVM state determined to be unstable.
>>> Exiting forcefully due to:
>>> java.lang.OutOfMemoryError: Java heap space
>>> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57) ~[na:1.8.0_65]
>>> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_65]
>>> at
>>> org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:126)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:86)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:297)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:374)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.BufferCell$Serializer.serialize(BufferCell.java:263)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:183)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:108)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:96)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77)
>>> ~[apache-cassandra-3.1.1.jar:3.1.1]
>>> at
>>> org.apache.

Re: Re: cassandra full gc too long

2015-12-29 Thread Sebastian Estevez
Hi Xutom,

What the " modern CQL client with paging support" mean? Is there opensource
> CQL client ?  I does not use any opensource CQL client and exporting data
> with my java code.


Use the datastax Java driver which is open source:

github:
https://github.com/datastax/java-driver

docs:
http://datastax.github.io/java-driver/
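
A paged read with that driver looks roughly like this (a sketch against the
2.x driver API; the contact point, keyspace/table, and predicate values are
placeholders for your own):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedExport {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // The fetch size is rows per page: the driver pulls the next page
            // as you iterate instead of buffering 40-80 million rows at once.
            Statement stmt = new SimpleStatement(
                    "SELECT * FROM myks.mytable WHERE partition_id = 'a' AND bucket_id = 'b'")
                    .setFetchSize(1000);
            ResultSet rs = session.execute(stmt);
            for (Row row : rs) {
                System.out.println(row); // write each row to your export sink here
            }
        } finally {
            cluster.close();
        }
    }
}

With that in place, the client's memory use stays bounded by the fetch size
rather than by the partition size.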


I have split that table into 47*20=940 parts: I have 47 partitions and
> each partition also has 20 buckets, so every time I execute such CQL: select
> * from table where partition_id=a and bucket_id=b, the number of rows in each
> select result may be 40-80 million.


I don't know what your data model looks like but it sounds (from your
description) like your partitions are too large. Check out my blog post on
data modeling and benchmarking:

http://www.sestevez.com/data-modeler/

Run stress on your test setup to gain confidence in your data model and to
ensure predictable scalability.


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Dec 29, 2015 at 2:04 AM, xutom  wrote:

>
> Thanks for your reply. Exactly: I have split that table into 47*20=940
> parts, I have 47 partitions and each partition also has 20 buckets, so
> every time I execute such CQL: select * from table where partition_id=a and
> bucket_id=b, the number of rows in each select result may be 40-80 million.
> What does "modern CQL client with paging support" mean? Is there an
> open-source CQL client? I do not use any open-source CQL client; I export
> data with my own Java code.
>
>
> At 2015-12-29 11:38:51, "Robert Coli" wrote:
>
> On Mon, Dec 28, 2015 at 5:57 PM, xutom  wrote:
>
>> I have 5 nodes in my C* cluster, and each node has the same
>> configuration file (cassandra-env.sh: MAX_HEAP_SIZE="32G" and
>> HEAP_NEWSIZE="8G"), and my Cassandra version is 2.1.1. Now I want to
>> export all data of one table; I am using select * from tablename,
>>
>
> Probably lower your heap size. If you're using CMS GC with a 32GB heap you
> will get long GC pauses.
>
> Also use a modern CQL client with paging support.
>
> In addition, upgrade to the head of 2.1.x, 2.1.1 is not a version anyone
> should be using in production at this time.
>
> =Rob
>
>
>
>
>


Re: OpsCenter metrics growth can relates to compactions?

2015-12-21 Thread Sebastian Estevez
>
> We do have a lot of keyspaces and column families.


Be careful, as C* (not just OpsCenter) will not run well with too many
tables. Usually 200 or 300 tables is a good upper bound, though I've seen folks
throw money at the problem and run more with special hardware (lots of RAM).

Most importantly, I truncated all rollups early this morning and during a
> big compaction (with hundreds of pending tasks at one point), the metrics
> grew to ~13G. Can I say compaction activities can increase the metric disk
> usage growth significantly? I have seen this behavior quite often with
> compaction.


This is normal, especially with size-tiered compaction.


Since it's metrics data, you can always decrease your TTL on the OpsCenter
tables, blacklist some keyspaces or tables, or keep truncating.

http://docs.datastax.com/en/opscenter/5.2/opsc/configure/opscMetricsConfig_r.html
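
For example, in the per-cluster OpsCenter conf file, something along these
lines (a sketch; the section and option names are from that page as I recall
them, so double-check against the docs, and the values are illustrative):

    [cassandra_metrics]
    ignored_keyspaces = system, system_auth, system_traces
    1min_ttl = 86400    # keep 1-minute rollups for a day instead of the default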






All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Sun, Dec 20, 2015 at 3:45 PM, John Wong  wrote:

> Hi.
>
> We are using the open source version of OpsCenter. We find it useful, but
> the disk space for OpsCenter metrics has been increasing and can sometime
> outgrow to 30-50G in a matter of a day or two. We do have a lot of
> keyspaces and column families.
>
> Usually this dev cluster is quiet on the weekend except for some QA jobs
> or our weekend primary read-repair. Most importantly, I truncated all
> rollups early this morning and during a big compaction (with hundreds of
> pending tasks at one point), the metrics grew to ~13G. Can I say compaction
> activities can increase the metric disk usage growth significantly? I have
> seen this behavior quite often with compaction.
>
> Thanks.
>
> John
>


Re: Rebuilding a new Cassandra node at 100Mb/s

2015-12-04 Thread Sebastian Estevez
If you change stream throughput it won't affect currently running streams
but it should affect new ones.
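
For example (a sketch; the value is in megabits per second and illustrative):

    nodetool setstreamthroughput 400   # raise the cap for streams that start from now on
    nodetool getstreamthroughput       # confirm the new setting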

all the best,

Sebastián
On Dec 4, 2015 5:39 AM, "Jonathan Ballet"  wrote:

> Thanks for your answer Rob,
>
> On 12/03/2015 08:32 PM, Robert Coli wrote:
>
>> On Thu, Dec 3, 2015 at 7:51 AM, Jonathan Ballet > > wrote:
>>
>> I noticed it's not really fast and my monitoring system shows that
>> the traffic incoming on this node is exactly at 100Mb/s (12.6MB/s).
>> I know it can be much more than that (I just tested sending a file
>> through SSH between the two machines and it goes up to 1Gb/s), is
>> there a limitation of some sort on Cassandra which limit the
>> transfer rate to 100Mb/s?
>>
>>
>> Probably limited by number of simultaneous parallel streams. Many people
>> do not want streams to go "as fast as possible" because their priority
>> is maintaining baseline service times while rebuilding/bootstrapping.
>>
>> Not sure there's a way to tune it, but this is definitely on the "large
>> node" radar..
>>
>
> I was actually a bit surprised that the limit seems to really be capped at
> 100 Mb/s, not more, not less. So I was thinking there was something else
> playing here...
>
>  Jonathan
>


Re: cassandra-stress 2.1: Generating data

2015-12-03 Thread Sebastian Estevez
You can run stress from a separate machine to isolate that impact.

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Dec 3, 2015 at 10:55 AM, Jake Luciani  wrote:

> The data is only being inserted from gen01
>
> On Thu, Dec 3, 2015 at 10:52 AM,  wrote:
>
>> Hi,
>>
>>
>>
>> I'm trying to insert data with cassandra-stress into a C* cluster with 6
>> nodes: *node001….006*
>>
>>
>>
>> The stress-tool is executed on a different machine (*gen01*) specifying
>> one of 6 nodes: tools/bin/cassandra-stress  user profile=cf.yml
>> ops\(insert=1\) n=500 -mode thrift -node node001  -rate threads=50
>>
>>
>>
>> My question: is the data generated on gen01 and then inserted into the
>> Cassandra nodes, OR does everything (generation and insertion) run on the
>> Cassandra nodes?
>>
>>
>>
>> Thanks.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
> --
> http://twitter.com/tjake
>


Re: Upgrade instructions don't make sense

2015-11-23 Thread Sebastian Estevez
We are happy to clarify the docs.

I'm ccing docs@datastax.

Thanks!

all the best,

Sebastián
On Nov 23, 2015 6:48 PM, "Jacob Hathaway"  wrote:

> Actually Sebastian, saying one does not always imply the other. And when
> it says this:
>
> In Cassandra 2.0.x, virtual nodes (vnodes) are enabled by default. Disable
> vnodes in the 2.0.x version before upgrading.
>
> That implies to me to disable vnodes no matter what.
>
> How do we get the docs fixed? I would suggest 2 things. First, more
> comprehensive in the instructions, especially for upgrades. Second, fixing
> little things like broken links, etc, etc.
>
> Thanks
> Jake Hathaway
> Cassandra Beginner
>
>
> On Nov 23, 2015, at 4:22 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
> If your cluster does not use vnodes, disable vnodes in each new
>> cassandra.yaml
>
>
> If your cluster *does* use vnodes do *not* disable them.
>
> All the best,
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Mon, Nov 23, 2015 at 5:55 PM, Robert Wille  wrote:
>
>> I’m wanting to upgrade from 2.0 to 2.1. The upgrade instructions at
>> http://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html
>>  have
>> the following, which leaves me with more questions than it answers:
>>
>> If your cluster does not use vnodes, disable vnodes in each new
>> cassandra.yaml before doing the rolling restart.
>> In Cassandra 2.0.x, virtual nodes (vnodes) are enabled by default.
>> Disable vnodes in the 2.0.x version before upgrading.
>>
>>1. In the cassandra.yaml
>>
>> <http://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html#upgradeCassandraDetails__cassandrayaml_unique_7>
>>  file,
>>set num_tokens to 1.
>>2. Uncomment the initial_token property and set it to 1 or to the
>>value of a generated token
>>
>> <http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configGenTokens_c.html>
>>  for
>>a multi-node cluster.
>>
>>
>> It seems strange that vnodes have to be disabled to upgrade, but whatever.
>> If I use an initial token generator to set the initial_token property of
>> each node, then I assume that my token ranges are all going to change, and
>> that there’s going to be a whole bunch of streaming as the data is shuffled
>> around. The docs don’t mention that. Should I wait until the streaming is
>> done before proceeding with the upgrade?
>>
>> The docs don’t talk about vnodes and initial_tokens post-upgrade. Can I
>> turn vnodes back on? Am I forever after stuck with having to have manually
>> generated initial tokens (and needing to have a unique cassandra.yaml for
>> every node)? Can I just set num_tokens = 256 and comment out initial_token
>> and do a rolling restart?
>>
>> Thanks in advance
>>
>> Robert
>>
>>
>
>


Re: Upgrade instructions don't make sense

2015-11-23 Thread Sebastian Estevez
>
> If your cluster does not use vnodes, disable vnodes in each new
> cassandra.yaml


If your cluster *does* use vnodes, do *not* disable them.
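
In other words, on a vnode cluster the relevant cassandra.yaml lines stay as
they are (a sketch):

    num_tokens: 256
    # initial_token:     <- leave commented out on a vnode cluster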

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Nov 23, 2015 at 5:55 PM, Robert Wille  wrote:

> I’m wanting to upgrade from 2.0 to 2.1. The upgrade instructions at
> http://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html
>  has
> the following, which leaves me with more questions than it answers:
>
> If your cluster does not use vnodes, disable vnodes in each new
> cassandra.yaml before doing the rolling restart.
> In Cassandra 2.0.x, virtual nodes (vnodes) are enabled by default. Disable
> vnodes in the 2.0.x version before upgrading.
>
>    1. In the cassandra.yaml file, set num_tokens to 1.
>    2. Uncomment the initial_token property and set it to 1 or to the
>    value of a generated token for a multi-node cluster.
>
>
> It seems strange that vnodes have to be disabled to upgrade, but whatever.
> If I use an initial token generator to set the initial_token property of
> each node, then I assume that my token ranges are all going to change, and
> that there’s going to be a whole bunch of streaming as the data is shuffled
> around. The docs don’t mention that. Should I wait until the streaming is
> done before proceeding with the upgrade?
>
> The docs don’t talk about vnodes and initial_tokens post-upgrade. Can I
> turn vnodes back on? Am I forever after stuck with having to have manually
> generated initial tokens (and needing to have a unique cassandra.yaml for
> every node)? Can I just set num_tokens = 256 and comment out initial_token
> and do a rolling restart?
>
> Thanks in advance
>
> Robert
>
>


Re: Help diagnosing performance issue

2015-11-18 Thread Sebastian Estevez
>
> When you say drop you mean reduce the value (to 1 day for example), not
> "don't set the value", right ?


Yes.

If I set max sstable age days to 1, my understanding is that SSTables with
> expired data (5 days) are not going to be compacted ever. And therefore my
> disk usage will keep growing forever. Did I miss something here ?


We will expire sstables whose highest TTL is beyond gc_grace_seconds as of
CASSANDRA-5228 <https://issues.apache.org/jira/browse/CASSANDRA-5228>. This
is nice because the sstable is just dropped for free (no need to scan it and
remove tombstones, which is very expensive), and DTCS guarantees that
all the data within an sstable is close together in time.
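
If you do decide to lower max_sstable_age_days, it's just a table property
change (a sketch against the views.views schema from earlier in this thread;
pick values that match your TTL and repair cadence):

    ALTER TABLE views.views
      WITH compaction = {'class': 'DateTieredCompactionStrategy',
                         'max_sstable_age_days': '1'}
      AND dclocal_read_repair_chance = 0.0;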

So, if I set max sstable age days to 1, I have to run repairs at least once
> a day, correct ?

I'm afraid I don't get your point about painful compactions.


I was referring to the problems described in CASSANDRA-9644
<https://issues.apache.org/jira/browse/CASSANDRA-9644>




All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Nov 18, 2015 at 5:53 PM, Antoine Bonavita 
wrote:

> Sebastian,
>
> Your help is very much appreciated. I re-read the blog post and also
> https://labs.spotify.com/2014/12/18/date-tiered-compaction/ but some
> things are still confusing me.
>
> Please see my questions inline below.
>
> On 11/18/2015 04:21 PM, Sebastian Estevez wrote:
>
>> Yep, I think you've mixed up your DTCS levers. I would read, or re-read
>> Marcus's post
>> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
>>
>> *TL;DR:*
>>
>>   * *base_time_seconds*  is the size of your initial window
>>   * *max_sstable_age_days* is the time after which you stop compacting
>> sstables
>>   * *default_time_to_live* is the time after which data expires and
>> sstables will start to become available for GC. (432000 is 5 days)
>>
>>
>> Could it be that compaction is putting those in cache constantly?
>>
>>
>> Yep, you'll keep compacting sstables until they're 10 days old per your
>> current settings and when you compact there are reads and then writes.
>>
>>
>>
>> If you aren't doing any updates and most of your reads are within 1
>> hour, you can probably afford to drop max sstable age days.
>>
> When you say drop you mean reduce the value (to 1 day for example), not
> "don't set the value", right ?
>
> If I set max sstable age days to 1, my understanding is that SSTables with
> expired data (5 days) are not going to be compacted ever. And therefore my
> disk usage will keep growing forever. Did I miss something here ?
>
> Just make
>> sure you're doing your repairs more often than the max sstable age days
>> to avoid some painful compactions.
>>
> So, if I set max sstable age days to 1, I have to run repairs at least
> once a day, correct ?
> I'm afraid I don't get your point about painful compactions.
>
> Along the same lines, you should probably set dclocal_read_repair_chance
>> to 0
>>
> Will try that.
>
>
> Regarding the heap configuration, both are very similar
>>
>>
>> Probably unrelated, but is there a reason why they're not identical?
>> Especially the different new gen size could have GC implications.
>>
> Both are calculated by cassandra-env.sh. If my bash skills are still
> intact, the NewGen size difference comes from the number of cores: the 64G
> machine has 12 cores where the 32G machine has 8 cores (I did not even
> realize this before looking into this, that's why I did not mention it in
> my previous emails).
>
> Thanks a lot for your help.
>
> A.
>
>
>>
>>
>>
>> All the best,
>>
>>

Re: Help diagnosing performance issue

2015-11-18 Thread Sebastian Estevez
Yep, I think you've mixed up your DTCS levers. I would read, or re-read
Marcus's post http://www.datastax.com/dev/blog/datetieredcompactionstrategy

*TL;DR:*

   - *base_time_seconds*  is the size of your initial window
   - *max_sstable_age_days* is the time after which you stop compacting
   sstables
   - *default_time_to_live* is the time after which data expires and
   sstables will start to become available for GC. (432000 is 5 days)


Could it be that compaction is putting those in cache constantly?


Yep, you'll keep compacting sstables until they're 10 days old per your
current settings and when you compact there are reads and then writes.



If you aren't doing any updates and most of your reads are within 1 hour,
you can probably afford to drop max sstable age days. Just make sure you're
doing your repairs more often than the max sstable age days to avoid some
painful compactions.

Along the same lines, you should probably set dclocal_read_repair_chance to
0





Regarding the heap configuration, both are very similar


Probably unrelated, but is there a reason why they're not identical?
Especially the different new gen size could have GC implications.




All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Nov 18, 2015 at 6:44 AM, Antoine Bonavita 
wrote:

> Sebastian, Robet,
>
> First, a big thank you to both of you for your help.
>
> It looks like you were right. I used pcstat (awesome tool, thanks for that
> as well) and it appears some files I would not expect to be in cache
> actually are. Here is a sample of my output (edited for convenience, adding
> the file timestamp from the OS):
>
> *
> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5951-big-Data.db
> - 000.619 % - Nov 16 12:25
> *
> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5954-big-Data.db
> - 000.681 % - Nov 16 13:44
> *
> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5955-big-Data.db
> -  000.610 % - Nov 16 14:11
> *
> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5956-big-Data.db
> - 015.621 % - Nov 16 14:26
> *
> /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5957-big-Data.db
> - 015.558 % - Nov 16 14:50
>
> The SSTables that come before are all at about 0% and the ones that come
> after it are all at about 15%.
>
> As you can see the first SSTable at 15% date back from 24h. Given my
> application I'm pretty sure those are not from the reads (reads of data
> older than 1h is definitely under 0.1% of reads). Could it be that
> compaction is putting those in cache constantly ?
> If so, then I'm probably confused on the meaning/effect of
> max_sstable_age_days (set at 10 in my case) and base_time_seconds (not set
> in my case so the default of 3600 applies). I would not expect any
> compaction to happen beyond the first hour and the 10 days is here to make
> sure data still gets expired and SSTables removed (thus releasing disk
> space). I don't see where the 24h come from.
> If you guys can shed some light on this, it would be awesome. I'm sure I
> got something wrong.
>
> Regarding the heap configuration, both are very similar:
> * 32G machine: -Xms8049M -Xmx8049M -Xmn800M
> * 64G machine: -Xms8192M -Xmx8192M -Xmn1200M
> I think we can rule that out.
>
> Thanks again for you help, I truly appreciate it.
>
> A.
>
> On 11/17/2015 08:48 PM, Robert Coli wrote:
>
>> On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez
>> mailto:sebastian.este...@datastax.com>>
>> wrote:
>>
>> You're sstables are probably falling out of page cache on the
>> smaller nodes and your slow disks are killing your latencies.
>>
>>
>> +1 most likely.
>>
>> Are the heaps the same size on both machines?
>>
>> =Rob
>>
>
> --
> Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv
> Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
>


Re: Help diagnosing performance issue

2015-11-17 Thread Sebastian Estevez
Hi,

Your sstables are probably falling out of page cache on the smaller nodes,
and your slow disks are killing your latencies.

Check to see if this is the case with pcstat:

https://github.com/tobert/pcstat
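
Usage is simply the tool pointed at the files you care about, e.g. (a sketch;
the path is illustrative):

    pcstat /var/lib/cassandra/data/views/views-*/*-Data.db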


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Nov 17, 2015 at 1:33 PM, Antoine Bonavita 
wrote:

> Hello,
>
> As I have not heard from anybody on the list, I guess I did not provide
> the right kind of information or I did not ask the right question.
>
> The things I forgot to mention in my previous email:
> * Checked the logs without noticing anything out of the ordinary.
> Memtables flushes occur every few minutes.
> * The compaction has been set to allow only one compaction at a time.
> Compaction throughput is the default.
>
> My question is really: where should I look to investigate deeper ?
> I did a lot of reading and watching datastax videos over the past week and
> I don't understand what could explain this behavior.
>
> Or maybe my expectations are too high. But I was under the impression that
> this kind of workload (heavy writes) was the sweet spot for Cassandra and
> that a node should be able to sustain 10K writes per second without
> breaking a sweat.
>
> Any help is appreciated. Much like any direction on what I should do to
> get help.
>
> Thanks,
>
> Antoine.
>
>
> On 11/16/2015 10:04 AM, Antoine Bonavita wrote:
>
>> Hello,
>>
>> We have a performance problem when trying to ramp up cassandra (as a
>> mongo replacement) on a very specific use case. We store a blob indexed
>> by a key and expire it after a few days:
>>
>> CREATE TABLE views.views (
>>  viewkey text PRIMARY KEY,
>>  value blob
>> ) WITH bloom_filter_fp_chance = 0.01
>>  AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>>  AND comment = ''
>>  AND compaction = {'max_sstable_age_days': '10', 'class':
>> 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
>>  AND compression = {'sstable_compression':
>> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>  AND dclocal_read_repair_chance = 0.1
>>  AND default_time_to_live = 432000
>>  AND gc_grace_seconds = 172800
>>  AND max_index_interval = 2048
>>  AND memtable_flush_period_in_ms = 0
>>  AND min_index_interval = 128
>>  AND read_repair_chance = 0.0
>>  AND speculative_retry = '99.0PERCENTILE';
>>
>> Our workload is mostly writes (approx. 96 writes for 4 reads). Each
>> value is about 3kB. Reads are mostly for "fresh" data (i.e. data that was
>> written recently).
>>
>> I have a 4 nodes cluster with spinning disks and a replication factor of
>> 3. For some historical reason 2 of the machines have 32G of RAM and the
>> other 2 have 64G.
>>
>> This is for the context.
>>
>> Now, when I use this cluster at about 600 writes per second per node
>> everything is fine but when I try to ramp it up (1200 writes per second
>> per node) the read latencies are fine on the 64G machines but start
>> going crazy on the 32G machines. When looking at disk iops, this is
>> clearly related:
>> * On 32G machines, read iops go from 200 to 1400.
>> * On 64G machines, read iops go from 10 to 20.
>>
>> So I thought this was related to the Memtable being flushed "too early"
>> on 32G machines. I increased memtable_heap_space_in_mb to 4G on the 32G
>> machines but it did not change anything.
>>
>> At this point I'm kind of lost and could use any help in understanding
>> why I'm generating so many read iops on the 32G machines compared to the
>> 64G one and why it goes crazy (x7) when I merely double the load.
>>
>> Thanks,
>>
>> A.
>>
>>
> --
> Antoine Bonavita (anto...@stickyads.tv) - CTO StickyADS.tv
> Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
> NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
>


Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-13 Thread Sebastian Estevez
I think you're just missing the steps in *Bold*:


If THERE ARE TWO OR MORE DIRECTORIES:

4) Identify from schema_column_families which cf ID is the "new" one
(currently in use).

cqlsh -e "select * from system.schema_column_families"|grep 

*5) Move the data from the "old" one to the "new" one and remove the
old directory. *

*6) If there are multiple "old" ones repeat 5 for every "old" directory.*

*7) run nodetool refresh*
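
Concretely, steps 5-7 look something like this (a sketch; the keyspace name
is a placeholder, and the cf IDs are the ones from this thread, with the
second one assumed to be the live one):

    mv /var/lib/cassandra/data/myks/cursors-3ecce75084d311e5bdd9dd7717dcdbd5/* \
       /var/lib/cassandra/data/myks/cursors-3ed23e8084d311e583b30fc0205655f5/
    # watch for sstable generation-number collisions before moving
    rm -r /var/lib/cassandra/data/myks/cursors-3ecce75084d311e5bdd9dd7717dcdbd5
    nodetool refresh myks cursors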



All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Nov 13, 2015 at 12:37 PM, Maciek Sakrejda  wrote:

> Any advice on how to proceed here? Sebastian seems to have guessed
> correctly at the underlying issue, but I'm still not sure how to resolve
> this given what I see in the data directory and the catalogs.
>
> On Wed, Nov 11, 2015 at 12:15 PM, Maciek Sakrejda 
> wrote:
>
>> On Wed, Nov 11, 2015 at 9:55 AM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> Stupid question, but how do I find the problem table? The error message
>>>> complains about a keyspace (by uuid); I haven't seen errors relating to a
>>>> specific table. I've poked around in the data directory, but I'm not sure
>>>> what I'm looking for.
>>>
>>>
>>> Is the message complaining about a *keyspace* or about *a table (cfid)*?
>>> Your original was complaining about a table:
>>>
>>
>>> at=IncomingTcpConnection.run UnknownColumnFamilyException reading from
>>>> socket; closing org.apache.cassandra.db.UnknownColumnFamilyException:
>>>> Couldn't find *cfId=3ecce750-84d3-11e5-bdd9-**dd7717dcdbd5*
>>>
>>>
>> Sorry, you're absolutely right--it's the table from this error message. I
>> confused myself. But now I was able to find it:
>>
>> cursors-3ecce75084d311e5bdd9dd7717dcdbd5
>> cursors-3ed23e8084d311e583b30fc0205655f5
>>
>> The second uuid is the one that shows up via the schema_columnfamilies
>> query, but on two of the nodes, the directory with the *other* uuid exists.
>> Can I just rename the directory on these two nodes? Or how should I proceed?
>>
>
>


Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-11 Thread Sebastian Estevez
>
> Stupid question, but how do I find the problem table? The error message
> complains about a keyspace (by uuid); I haven't seen errors relating to a
> specific table. I've poked around in the data directory, but I'm not sure
> what I'm looking for.


Is the message complaining about a *keyspace* or about *a table (cfid)*?
Your original was complaining about a table:

at=IncomingTcpConnection.run UnknownColumnFamilyException reading from
> socket; closing org.apache.cassandra.db.UnknownColumnFamilyException:
> Couldn't find *cfId=3ecce750-84d3-11e5-bdd9-**dd7717dcdbd5*


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Nov 11, 2015 at 12:26 PM, Maciek Sakrejda  wrote:

> On Tue, Nov 10, 2015 at 3:20 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> #1 The cause of this problem is a CREATE TABLE statement collision. Do not 
>> generate tables
>> dynamically from multiple clients, even with IF NOT EXISTS. First thing you 
>> need to do is
>> fix your code so that this does not happen. Just create your tables manually 
>> from cqlsh allowing
>> time for the schema to settle.
>>
>> #2 Here's the fix:
>>
>> 1) Change your code to not automatically re-create tables (even with IF NOT 
>> EXISTS).
>>
>> 2) Run a rolling restart to ensure schema matches across nodes. Run nodetool 
>> describecluster
>>
>> around your cluster. Check that there is only one schema version.
>>
>> Thanks, that seems to have resolved the schema version inconsistency
> (though I'm still getting the original error).
>
>> ON EACH NODE:
>>
>> 3) Check your filesystem and see if you have two directories for the table in
>>
>> question in the data directory.
>>
>> Stupid question, but how do I find the problem table? The error message
> complains about a keyspace (by uuid); I haven't seen errors relating to a
> specific table. I've poked around in the data directory, but I'm not sure
> what I'm looking for.
>
>


Re: Cassandra compaction stuck? Should I disable?

2015-11-11 Thread Sebastian Estevez
Use 'nodetool compactionhistory'

all the best,

Sebastián
On Nov 11, 2015 3:23 AM, "PenguinWhispererThe ." <
th3penguinwhispe...@gmail.com> wrote:

> Does compactionstats show only stats for completed compactions (100%)? It
> might be that the compaction is running constantly, over and over again.
> In that case I need to know what I might be able to do to stop this
> constant compaction so I can start a nodetool repair.
>
> Note that there is a lot of traffic on this columnfamily so I'm not sure
> if temporarily disabling compaction is an option. The repair will probably
> take long as well.
>
> Sebastian and Rob: do you might have any more ideas about the things I put
> in this thread? Any help is appreciated!
>
> 2015-11-10 20:03 GMT+01:00 PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com>:
>
>> Hi Sebastian,
>>
>> Thanks for your response.
>>
>> No swap is used. No offense, I just don't see a reason why having swap
>> would be the issue here. I put swappiness at 1. I also have JNA installed.
>> That should prevent Java being swapped out as well, AFAIK.
>>
>>
>> 2015-11-10 19:50 GMT+01:00 Sebastian Estevez <
>> sebastian.este...@datastax.com>:
>>
>>> Turn off Swap.
>>>
>>>
>>> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
>>>
>>>
>>> All the best,
>>>
>>>
>>> Sebastián Estévez
>>>
>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>
>>> On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com> wrote:
>>>
>>>> I also have the following memory usage:
>>>> [root@US-BILLINGDSX4 cassandra]# free -m
>>>>              total   used   free  shared  buffers  cached
>>>> Mem:         12024   9455   2569       0      110    2163
>>>> -/+ buffers/cache:    7180   4844
>>>> Swap:         2047      0   2047
>>>>
>>>> Still a lot free and a lot of free buffers/cache.
>>>>
>>>> 2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <
>>>> th3penguinwhispe...@gmail.com>:
>>>>
>>>>> Still stuck with this. However I enabled GC logging. This shows the
>>>>> following:
>>>>>
>>>>> [root@myhost cassandra]# tail -f gc-1447180680.log
>>>>> 2015-11-10T18:41:45.516+: 225.428: [GC
>>>>> 2721842K->2066508K(6209536K), 0.0199040 secs]
>>>>> 2015-11-10T18:41:45.977+: 225.889: [GC
>>>>> 2721868K->2066511K(6209536K), 0.0221910 secs]
>>>>> 2015-11-10T18:41:46.437+: 226.349: [GC
>>>>> 2721871K->2066524K(6209536K), 0.0222140 secs]
>>>>> 2015-11-10T18:41:46.897+: 226.809: [GC
>>>>> 2721884K->2066539K(6209536K), 0.0224140 secs]
>>>>> 2015-11-10T18:41:47.359+: 227.271: [GC
>>>>> 2721899K->2066538K(6209536K), 0.0302520 secs]
>>>>> 2015-11-10T18:41:47.821+: 227.733: [GC
>>>>> 2721898K->2066557K(6209536K), 0.0280530 secs]
>>>>> 2015-11-10T18:41:48.293+: 228.205: [GC
>>>>> 2721917K->2066571K(6209536K), 0.0218000 secs]
>>>>> 2015-11-10T18:41:48.790+: 228.702: [GC
>>>>> 2721931K->2066780K(6209536K), 0.0292470 secs]
>>&g

Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-10 Thread Sebastian Estevez
#1 The cause of this problem is a CREATE TABLE statement collision. Do
not generate tables
dynamically from multiple clients, even with IF NOT EXISTS. First
thing you need to do is
fix your code so that this does not happen. Just create your tables
manually from cqlsh allowing
time for the schema to settle.

#2 Here's the fix:

1) Change your code to not automatically re-create tables (even with
IF NOT EXISTS).

2) Run a rolling restart to ensure schema matches across nodes. Run
nodetool describecluster

around your cluster. Check that there is only one schema version.

ON EACH NODE:

3) Check your filesystem and see if you have two directories for the table in

question in the data directory.

If THERE ARE TWO OR MORE DIRECTORIES:

4) Identify from schema_column_families which cf ID is the "new" one
(currently in use).

cqlsh -e "select * from system.schema_column_families"|grep 

5) Move the data from the "old" one to the "new" one and remove the
old directory.

6) If there are multiple "old" ones repeat 5 for every "old" directory.

7) run nodetool refresh

IF THERE IS ONLY ONE DIRECTORY:

No further action is needed.


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Nov 10, 2015 at 6:09 PM, Maciek Sakrejda  wrote:

> Oh and for what it's worth, I've also looked through the logs for this
> node, and the oldest error in the logs seems to be:
>
> [] 06 Nov 22:10:53.260 * pri=ERROR t=Thrift:16
> at=CustomTThreadPoolServer.run Error occurred during processing of message.
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ConfigurationException: Column family ID
> mismatch (found 3ed23e80-84d3-11e5-83b3-0fc0205655f5; expected
> 3ecce750-84d3-11e5-bdd9-dd7717dcdbd5)
>
> Then the logs show a compaction, and then the UnknownColumnFamilyException
> starts occurring.
>
>


Re: Cassandra compaction stuck? Should I disable?

2015-11-10 Thread Sebastian Estevez
Turn off Swap.

http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html?scroll=reference_ds_sxl_gf3_2k__disable-swap
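
In short, something along these lines; the fstab edit is what makes it stick
across reboots:

sudo swapoff --all
# then remove or comment out any swap lines in /etc/fstab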


All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Nov 10, 2015 at 1:48 PM, PenguinWhispererThe . <
th3penguinwhispe...@gmail.com> wrote:

> I also have the following memory usage:
> [root@US-BILLINGDSX4 cassandra]# free -m
>              total   used   free  shared  buffers  cached
> Mem:         12024   9455   2569       0      110     2163
> -/+ buffers/cache:    7180   4844
> Swap:         2047      0   2047
>
> Still a lot free and a lot of free buffers/cache.
>
> 2015-11-10 19:45 GMT+01:00 PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com>:
>
>> Still stuck with this. However I enabled GC logging. This shows the
>> following:
>>
>> [root@myhost cassandra]# tail -f gc-1447180680.log
>> 2015-11-10T18:41:45.516+0000: 225.428: [GC 2721842K->2066508K(6209536K),
>> 0.0199040 secs]
>> 2015-11-10T18:41:45.977+0000: 225.889: [GC 2721868K->2066511K(6209536K),
>> 0.0221910 secs]
>> 2015-11-10T18:41:46.437+0000: 226.349: [GC 2721871K->2066524K(6209536K),
>> 0.0222140 secs]
>> 2015-11-10T18:41:46.897+0000: 226.809: [GC 2721884K->2066539K(6209536K),
>> 0.0224140 secs]
>> 2015-11-10T18:41:47.359+0000: 227.271: [GC 2721899K->2066538K(6209536K),
>> 0.0302520 secs]
>> 2015-11-10T18:41:47.821+0000: 227.733: [GC 2721898K->2066557K(6209536K),
>> 0.0280530 secs]
>> 2015-11-10T18:41:48.293+0000: 228.205: [GC 2721917K->2066571K(6209536K),
>> 0.0218000 secs]
>> 2015-11-10T18:41:48.790+0000: 228.702: [GC 2721931K->2066780K(6209536K),
>> 0.0292470 secs]
>> 2015-11-10T18:41:49.290+0000: 229.202: [GC 2722140K->2066843K(6209536K),
>> 0.0288740 secs]
>> 2015-11-10T18:41:49.756+0000: 229.668: [GC 2722203K->2066818K(6209536K),
>> 0.0283380 secs]
>> 2015-11-10T18:41:50.249+0000: 230.161: [GC 2722178K->2067158K(6209536K),
>> 0.0218690 secs]
>> 2015-11-10T18:41:50.713+0000: 230.625: [GC 2722518K->2067236K(6209536K),
>> 0.0278810 secs]
>>
>> This is a VM with 12GB of RAM. I raised the HEAP_SIZE to 6GB and
>> HEAP_NEWSIZE to 800MB.
>>
>> Still the same result.
>>
>> This looks very similar to following issue:
>>
>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201411.mbox/%3CCAJ=3xgRLsvpnZe0uXEYjG94rKhfXeU+jBR=q3a-_c3rsdd5...@mail.gmail.com%3E
>>
>> Is the only option to upgrade memory? I mean, I can't believe it's
>> just loading all its data into memory. That would mean having to keep
>> scaling up the node to keep it working?
>>
>>
>> 2015-11-10 9:36 GMT+01:00 PenguinWhispererThe . <
>> th3penguinwhispe...@gmail.com>:
>>
>>> Correction...
>>> I was grepping for Segmentation in the strace output, and it happens a lot.
>>>
>>> Do I need to run a scrub?
>>>
>>> 2015-11-10 9:30 GMT+01:00 PenguinWhispererThe . <
>>> th3penguinwhispe...@gmail.com>:
>>>
 Hi Rob,

 Thanks for your reply.

 2015-11-09 23:17 GMT+01:00 Robert Coli :

> On Mon, Nov 9, 2015 at 1:29 PM, PenguinWhispererThe . <
> th3penguinwhispe...@gmail.com> wrote:
>>
>> In Opscenter I see one of the nodes is orange. It seems like it's
>> working on compaction. I used nodetool compactionstats and whenever I did
>> this the Completed and percentage stays the same (even with hours in
>> between).
>>
> Are you the same person from IRC, or a second report today of
> compaction hanging in this way?
>
 Same person ;) I just didn't have much to work with from the chat there.
 I want to understand the issue more, see what I can tune or fix. I want to
 do nodetool repair before upgrading to 2.1.11 but the compaction is
 blocking it.

>
>
>
 What version of Cassandra?
>
 2.0.9

> I currently don't see cpu load from cassandra on that node. So it
>> seems stuck (somewhere mid 60%). Also some other nodes have compaction on
>> the same columnfamily. I don't see any progress.
>>
>>  WARN [RMI TCP Connection(554)-192.168.0.68] 2015-11-09 17:18:13,677 
>> ColumnFamilyStore.java (li

Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Sebastian Estevez
You probably need to configure ulimits correctly.

What does this give you?

/proc/<pid>/limits
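
For reference, a sketch of checking and raising the limits; the pid lookup is
just one common way to find the process, and the values are the commonly
recommended ones from that page:

cat /proc/$(pgrep -f CassandraDaemon)/limits

# /etc/security/limits.d/cassandra.conf
cassandra - memlock unlimited
cassandra - nofile  100000
cassandra - nproc   32768
cassandra - as      unlimited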


All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Nov 6, 2015 at 1:56 PM, Branton Davis 
wrote:

> We recently went down the rabbit hole of trying to understand the output
> of lsof.  lsof -n has a lot of duplicates (files opened by multiple
> threads).  Use 'lsof -p $PID' or 'lsof -u cassandra' instead.
>
> On Fri, Nov 6, 2015 at 12:49 PM, Bryan Cheng 
> wrote:
>
>> Is your compaction progressing as expected? If not, this may cause an
>> excessive number of tiny db files. Had a node refuse to start recently
>> because of this, had to temporarily remove limits on that process.
>>
>> On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis 
>> wrote:
>>
>>> I'm getting too many open files errors and I'm wondering what the
>>> cause may be.
>>>
>>> lsof -n | grep java shows 1.4M files
>>>
>>> ~90k are inodes
>>> ~70k are pipes
>>> ~500k are cassandra services in /usr
>>> ~700K are the data files.
>>>
>>> What might be causing so many files to be open?
>>>
>>> jas
>>>
>>
>>
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Sebastian Estevez
Are there any errors in your javascript console?

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Nov 4, 2015 at 3:46 PM, Kai Wang  wrote:

> No they don't.
>
> On Wed, Nov 4, 2015 at 3:42 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Do they come back if you restart opscenterd?
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang  wrote:
>>
>>> Forgot to mention. I am running OpsCenter 5.2.2.
>>>
>>> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang  wrote:
>>>
>>>> Hi,
>>>>
>>>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>>>> save anymore. It starts with an empty dashboard with no widget or graph. If
>>>> I add some graph/widget, they are being updated fine. But if I refresh the
>>>> browser, the dashboard becomes empty again.
>>>>
>>>> Also there's no "DEFAULT" tab on the dashboard as the user guide shows.
>>>> I am not sure if it was there before.
>>>>
>>>
>>>
>>
>


Re: Can't save Opscenter Dashboard

2015-11-04 Thread Sebastian Estevez
Do they come back if you restart opscenterd?

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Nov 4, 2015 at 3:41 PM, Kai Wang  wrote:

> Forgot to mention. I am running OpsCenter 5.2.2.
>
> On Wed, Nov 4, 2015 at 3:39 PM, Kai Wang  wrote:
>
>> Hi,
>>
>> Today after one of the nodes is rebooted, OpsCenter dashboard doesn't
>> save anymore. It starts with an empty dashboard with no widget or graph. If
>> I add some graph/widget, they are being updated fine. But if I refresh the
>> browser, the dashboard becomes empty again.
>>
>> Also there's no "DEFAULT" tab on the dashboard as the user guide shows. I
>> am not sure if it was there before.
>>
>
>


Re: Cassandra -stress write - Storage location

2015-10-29 Thread Sebastian Estevez
You can do a describe table to see the table layout and you can select to
see some sample rows. Stress is pretty powerful though.
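
For example, assuming the default stress schema (the quoting matters, since
the legacy stress tool creates the keyspace and table with capital letters):

cqlsh -e 'DESCRIBE TABLE "Keyspace1"."Standard1";'
cqlsh -e 'SELECT * FROM "Keyspace1"."Standard1" LIMIT 5;'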

I just dropped a blog post tonight on doing more targeted benchmarking /
sizing with stress and my data modeler. Take a look:

http://www.sestevez.com/data-modeler/
On Oct 30, 2015 1:01 AM, "Arun Sandu"  wrote:

> Thanks. Can I know the format of the data that gets stored? Can you
> please suggest some ways to perform load testing? I need a big picture
> of all the statistics.
>
> Thanks again
> Arun
>
> On Thu, Oct 29, 2015 at 9:41 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> By default this will go in Keyspace1 Standard1.
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Fri, Oct 30, 2015 at 12:07 AM, Arun Sandu 
>> wrote:
>>
>>> Hi,
>>>
>>> I am currently working on load testing my cluster. When we write 10
>>> to Cassandra, where does the written data get stored in Cassandra, and the
>>> same for the read operation too?
>>>
>>> ./cassandra-stress write n=10 -rate threads=100 -node 10.34.100.13
>>>
>>> ./cassandra-stress read n=10 -node 10.34.100.13
>>>
>>>
>>> --
>>> Thanks
>>> Arun
>>>
>>
>>
>
>
> --
> Thanks&Regards
> Arun Kumar S
> 816-699-3039
>
> *"This Moment Is Not Permanent...!!"*
>


Re: Cassandra -stress write - Storage location

2015-10-29 Thread Sebastian Estevez
By default this will go in Keyspace1 Standard1.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Oct 30, 2015 at 12:07 AM, Arun Sandu  wrote:

> Hi,
>
> I am currently working on load testing my cluster. When we write 10 to
> Cassandra, where does the written data get stored in Cassandra, and the same
> for the read operation too?
>
> ./cassandra-stress write n=10 -rate threads=100 -node 10.34.100.13
>
> ./cassandra-stress read n=10 -node 10.34.100.13
>
>
> --
> Thanks
> Arun
>


Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Sebastian Estevez
The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6
and Jeff's running 2.1.11.

@Jeff

How often does this happen? Can you watch ttop as soon as you notice
increased read/write latencies?

wget
> https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar
> java -jar sjk-plus-0.3.6.jar ttop -s localhost:7199 -n 30 -o CPU


This should at least tell you which Cassandra threads are causing high
memory allocations and CPU consumption.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Oct 29, 2015 at 9:36 PM, Graham Sanderson  wrote:

> you didn’t say what you upgraded from, but if it is 2.0.x, then look at
> CASSANDRA-9504
>
> If so and you use
>
> commitlog_sync: batch
>
> Then you probably want to set
>
> commitlog_sync_batch_window_in_ms: 1 (or 2)
>
> Note I’m only slightly convinced this is the cause because of your
> READ_REPAIR issues (though if you are dropping a lot of MUTATIONS under
> load and your machines are overloaded, you’d be doing more READ_REPAIR than
> usual probably)
>
> On Oct 29, 2015, at 8:12 PM, Jeff Ferland  wrote:
>
> Using DSE 4.8.1 / 2.1.11.872, Java version 1.8.0_66
>
> We upgraded our cluster this weekend and have been having issues with
> dropped mutations since then. Intensely investigating a single node and
> toying with settings has revealed that GC stalls don’t make up enough time
> to explain the 10 seconds of apparent stall that would cause a hangup.
>
> tpstats output typically shows active threads in the low single digits and
> pending similar or 0. Before a failure, pending MutationStage will
> skyrocket into 5+ digits. System.log regularly shows the gossiper
> complaining, then slow log complaints, then logs dropped mutations.
>
> For the entire minute of 00:55, the gc logging shows no single pause > .14
> seconds and most of them much smaller. Abbreviated GC log after switching
> to g1gc (problem also exhibited before G1GC):
>
> 2015-10-30T00:55:00.550+0000: 6752.857: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:02.843+0000: 6755.150: [GC pause (GCLocker Initiated GC)
> (young)
> 2015-10-30T00:55:05.241+0000: 6757.548: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:07.755+0000: 6760.062: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:10.532+0000: 6762.839: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:13.080+0000: 6765.387: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:15.914+0000: 6768.221: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:18.619+0000: 6770.926: [GC pause (GCLocker Initiated GC)
> (young)
> 2015-10-30T00:55:23.270+0000: 6775.578: [GC pause (GCLocker Initiated GC)
> (young)
> 2015-10-30T00:55:28.662+0000: 6780.969: [GC pause (GCLocker Initiated GC)
> (young)
> 2015-10-30T00:55:33.326+0000: 6785.633: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:36.600+0000: 6788.907: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:40.050+0000: 6792.357: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:43.728+0000: 6796.035: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:48.216+0000: 6800.523: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:53.621+0000: 6805.928: [GC pause (G1 Evacuation Pause)
> (young)
> 2015-10-30T00:55:59.048+0000: 6811.355: [GC pause (GCLocker Initiated GC)
> (young)
>
> System log snippet of the pattern I’m seeing:
>
> WARN  [GossipTasks:1] 2015-10-30 00:55:25,129  Gossiper.java:747 - Gossip
> stage has 1 pending tasks; skipping status check (no nodes will be marked
> down)
> INFO  [CompactionExecutor:210] 2015-10-30 00:55:26,006
>  CompactionTask.java:141 - Compacting
> [SSTableReader(path='/mnt/cassandra/data/system/hints/system-hints-ka-8283-Data.db'),
> SSTableReader(path='/mnt/cassandra/data/system/hints/system-hints-ka-8286-Data.db'),
> SSTableReader(path='/mnt/cassandra/data/system/hints/system-hints-ka-8284-Data.db'),
> SSTableReader(path='/mnt/cassandra/data/system/hints/system-hints-ka-8285-Data.db'),
> SSTableReader(path='/mnt/cassandra/data/system/hints/system-hints-k

Re: Data visualization tools for Cassandra

2015-10-20 Thread Sebastian Estevez
For zeppelin check Duy Hai's branch:

https://github.com/doanduyhai/incubator-zeppelin/blob/Spark_Cassandra_Demo/README.md

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Oct 20, 2015 at 1:28 PM, Mathieu Delsaut <
mathieu.dels...@univ-reunion.fr> wrote:

> Try apache zeppelin. It's a pretty young project but very useful.
> https://zeppelin.incubator.apache.org/
>
> Include a Cassandra and Spark connector among many others.
>
>
> Mathieu  Delsaut 
> *Research Engineer at LE²P*
> +262 (0)262 93 86 08
>  
> 
> 
>
> 2015-10-20 21:24 GMT+04:00 Jon Haddad :
>
>> PySpark (dataframes) + Pandas + Seaborn/Matplotlib
>>
>> On Oct 20, 2015, at 11:22 AM, Charles Rich  wrote:
>>
>> Take a look at jKool, a DataStax partner, at jKoolCloud.com. It provides
>> visualization for data in DSE.
>>
>> Regards,
>>
>> Charley
>>
>> *From:* Gene [mailto:gh5...@gmail.com ]
>> *Sent:* Tuesday, October 20, 2015 1:17 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Data visualization tools for Cassandra
>>
>> Have you looked at OpsCenter?
>>
>> On Tue, Oct 20, 2015 at 9:59 AM, Vikram Kone 
>> wrote:
>> Hi,
>> We are looking for data visualization tools to chart some graphs over
>> data present in our cassandra cluster. Are there any open source
>> visualization tools that people are using to quickly draw some charts over
>> data in their cassandra tables? We are using Datastax version of cassandra,
>> in case that is relevant.
>>
>>
>> thanks
>>
>>
>>
>>
>>
>> *The information contained in this e-mail and in any attachment is
>> confidential andis intended solely for the use of the individual or entity
>> to which it is addressed.Access, copying, disclosure or use of such
>> information by anyone else is unauthorized. If you are not the intended
>> recipient, please delete the e-mail and refrain from use of such
>> information.*
>>
>>
>>
>


Re: "invalid global counter shard detected" warning on 2.1.3 and 2.1.10

2015-10-20 Thread Sebastian Estevez
Hi Branton,


>- How much should we be freaking out?
>
> The impact of this is possible counter inaccuracy (overcounting or
undercounting). If you are expecting counters to be exactly accurate, you are
already in trouble, because they are not: they are non-idempotent operations
running in a distributed system (you've probably read Aleksey's post by now).

>
>- Why is this recurring?  If I understand what's happening, this is a
>self-healing process.  So, why would it keep happening?  Are we possibly
>using counters incorrectly?
>
> Even after running sstableupgrade, your counter cells will not be upgraded
until they have all been incremented. You may still be seeing the warning on
pre-2.1 counter cells which have not been incremented yet.

>
>- What does it even mean that there were multiple shards for the same
>counter?  How does that situation even occur?
>
> We used to maintain "counter shards" at the sstable level in pre 2.1
counters. This means that on compaction or reads we would essentially add
the shards together when getting the value or merging the cells. This
caused a series of problems including the warning you are still seeing.
TL;DR, we now store the final value of the counter (not the
increment/shard) at the commitlog level and beyond in post 2.1 counters, so
this is no longer an issue. Again, read Aleksey's post.

Many users started fresh tables after upgrading to 2.1, updated only the new
tables, and added application logic to decide which table to read from.
Something like monthly tables works well if you're doing time series
counters, and would ensure that you stop seeing the warnings on the
new/active tables and get the benefits of 2.1 counters quickly.
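
As an illustration, a minimal sketch of the monthly-table approach (the
keyspace, table, and column names are hypothetical):

CREATE TABLE myks.page_counts_201510 (
    page_id text,
    day timestamp,
    hits counter,
    PRIMARY KEY (page_id, day)
);

UPDATE myks.page_counts_201510 SET hits = hits + 1
WHERE page_id = 'home' AND day = '2015-10-20';

The application derives the table name from the current date on writes and
fans reads out across whichever monthly tables it needs.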




All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Oct 20, 2015 at 12:21 PM, Branton Davis 
wrote:

> Howdy Cassandra folks.
>
> Crickets here and it's sort of unsettling that we're alone with this
> issue.  Is it appropriate to create a JIRA issue for this or is there maybe
> another way to deal with it?
>
> Thanks!
>
> On Sun, Oct 18, 2015 at 1:55 PM, Branton Davis  > wrote:
>
>> Hey all.
>>
>> We've been seeing this warning on one of our clusters:
>>
>> 2015-10-18 14:28:52,898 WARN  [ValidationExecutor:14]
>> org.apache.cassandra.db.context.CounterContext invalid global counter shard
>> detected; (4aa69016-4cf8-4585-8f23-e59af050d174, 1, 67158) and
>> (4aa69016-4cf8-4585-8f23-e59af050d174, 1, 21486) differ only in count; will
>> pick highest to self-heal on compaction
>>
>>
>> From what I've read and heard in the IRC channel, this warning could be
>> related to not running upgradesstables after upgrading from 2.0.x to
>> 2.1.x.  I don't think we ran that then, but we've been at 2.1 since last
>> November.  Looking back, the warnings start appearing around June, when no
>> maintenance had been performed on the cluster.  At that time, we had been
>> on 2.1.3 for a couple of months.  We've been on 2.1.10 for the last week
>> (the upgrade was when we noticed this warning for the first time).
>>
>> From a suggestion in IRC, I went ahead and ran upgradesstables on all the
>> nodes.  Our weekly repair also ran this morning.  But the warnings still
>> show up throughout the day.
>>
>> So, we have many questions:
>>
>>- How much should we be freaking out?
>>- Why is this recurring?  If I understand what's happening, this is a
>>self-healing process.  So, why would it keep happening?  Are we possibly
>>using counters incorrectly?
>>- What does it even mean that there were multiple shards for the same
>>counter?  How does that situation even occur?
>>
>> We're pretty lost here, so any help would be greatly appreciated.
>>
>> Thanks!
>>
>
>


Re: compact/repair shouldn't compete for normal compaction resources.

2015-10-19 Thread Sebastian Estevez
The validation compaction part of repair is susceptible to the compaction
throttling knob `nodetool getcompactionthroughput`
/ `nodetool setcompactionthroughput` and you can use that to tune down the
resources that are being used by repair.

Check out this post by driftx on advanced repair techniques.
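
A few examples of both knobs (the numbers and tokens are illustrative, not
recommendations):

nodetool getcompactionthroughput
nodetool setcompactionthroughput 8    # MB/s; lower is gentler, 0 disables throttling

nodetool repair -pr mykeyspace        # primary range only, run node by node
nodetool repair -st <start_token> -et <end_token> mykeyspace   # subrange repair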

Given your other question, I agree with Raj that it might be a good idea to
decommission the new nodes rather than repairing, depending on how much data
has made it to them and how tight you were on resources before adding nodes.


All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Sun, Oct 18, 2015 at 8:18 PM, Kevin Burton  wrote:

> I'm doing a big nodetool repair right now and I'm pretty sure the added
> overhead is impacting our performance.
>
> Shouldn't you be able to throttle repair so that normal compactions can
> use most of the resources?
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Re: Removed node is not completely removed

2015-10-14 Thread Sebastian Estevez
We still keep endpoints in memory. Not sure how you got to this state, but
try a rolling restart.
On Oct 14, 2015 9:43 AM, "Tom van den Berge" 
wrote:

> Thanks for that Michael, I did not know that. However, the node is not
> listed in the system.peers table on any node, so it seems that the problem
> is not in this table.
>
>
>
> On Wed, Oct 14, 2015 at 3:30 PM, Laing, Michael  > wrote:
>
>> Remember that the system keyspace uses LocalStrategy: each node has its
>> own set of system tables. -ml
>>
>> On Wed, Oct 14, 2015 at 9:17 AM, Tom van den Berge <
>> tom.vandenbe...@gmail.com> wrote:
>>
>>> Hi Carlos,
>>>
>>> I'm using 2.1.6. The mysterious node is not in the peers table. Any
>>> other ideas?
>>> One of my existing nodes is not present in the system.peers table,
>>> though. Should I be worried?
>>>
>>> Regards,
>>> Tom
>>>
>>> On Wed, Oct 14, 2015 at 2:27 PM, Carlos Rolo  wrote:
>>>
 Check the system.peers table to see if the IP is still there. If so, edit
 the table and remove the offending IP.

 You are probably running into this:
 https://issues.apache.org/jira/browse/CASSANDRA-6053

 Regards,

 Carlos Juzarte Rolo
 Cassandra Consultant

 Pythian - Love your data

 rolo@pythian | Twitter: cjrolo | Linkedin: 
 *linkedin.com/in/carlosjuzarterolo
 *
 Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
 www.pythian.com

 On Wed, Oct 14, 2015 at 12:26 PM, Tom van den Berge <
 tom.vandenbe...@gmail.com> wrote:

> I have removed a node with nodetool removenode, which completed ok.
> Nodetool status does not list the node anymore.
>
> But since then, Im seeing messages in my other nodes log files
> referring to the removed node:
>
>  INFO [GossipStage:38] 2015-10-14 11:18:26,322 Gossiper.java (line
> 968) InetAddress /10.68.56.200 is now DOWN
>  INFO [GossipStage:38] 2015-10-14 11:18:26,324 StorageService.java
> (line 1891) Removing tokens [85070591730234615865843651857942052863] for /
> 10.68.56.200
>
>
> These two messages appear every minute.
> I've tried nodetool removenode again (Host ID not found) and
> removenode force (no token removals in process).
> The jmx unsafeAssassinateEndpoint gives a NullPointerException.
>
> What can I do to remove the node entirely?
>
>
>

 --




>>>
>>
>


Re: Post mortem of a large Cassandra datacenter migration.

2015-10-12 Thread Sebastian Estevez
For 1 and 3, have you looked at CASSANDRA-8611?


For 4, you don't need to attach a profiler to check if GC is a problem.
Just grep the system log for GCInspector.
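
For example (the log path varies by install):

grep GCInspector /var/log/cassandra/system.log | tail -20

Each GCInspector line reports the collector, the pause duration, and the heap
usage, so long pauses stand out without attaching anything to the JVM.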

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Oct 9, 2015 at 8:17 PM, Kevin Burton  wrote:

> We just finished up a pretty large migration of about 30 Cassandra boxes
> to a new datacenter.
>
> We'll be migrating to about 60 boxes here in the next month so scalability
> (and being able to do so cleanly) is important.
>
> We also completed an Elasticsearch migration at the same time.  The ES
> migration worked fine. A few small problems with it doing silly things with
> relocating nodes too often but all in all it was somewhat painless.
>
> At one point we were doing 200 shard reallocations in parallel and pushing
> about 2-4Gbit...
>
> The Cassandra migration, however, was a LOT harder.
>
> One quick thing I wanted to point out - we're hiring.  So if you're a
> killer Java Devops guy drop me an email
>
> Anyway.  Back to the story.
>
> Obviously we did a bunch of research before hand to make sure we had
> plenty of bandwidth.  This was a migration from Washington DC to Germany.
>
> Using iperf, we could consistently push about 2Gb back and forth between
> DC and Germany.  This includes TCP as we switched to using large window
> sizes.
>
> The big problem that we had was that we could only bootstrap one node at
> a time.  This ends up taking a LOT more time because you have to keep
> checking on a node so that you can start the next one.
>
> I imagine one could write a coordinator script but we had so many problems
> with CS that it wouldn't have worked if we tried.
>
> We had 2-3 main problems.
>
> 1.  Sometimes streams would just stop and lock up.  No explanation why.
> They would just lock up and not resume.  We'd wait 10-15 minutes with no
> response.. This would require us abort and retry.  Had we updated to
> Cassandra 2.2 before hand I think the new resume support would work.
>
> 2.  Some of our keyspaces created by Thrift caused exceptions regarding
> "too few resources" when trying to bootstrap. Dropping these keyspaces
> fixed the problem.  They were just test keyspaces so it didn't matter.
>
> 3.  Because of #1, it's probably better to make sure you have 2x or more
> disk space on the remote end before you do the migration.  This way you can
> boot the same number of nodes you had before and just decommission the old
> ones quickly. (er use nodetool removenode - see below)
>
> 4.  We're not sure why, but our OLDER machines kept locking up during this
> process.  This kept requiring us to do a rolling restart on all the older
> nodes.  We suspect this is GC and we were seeing single cores to 100%.  I
> didn't have time to attach a profiler as were all burned out at this point
> and just wanted to get it over with.  This problem meant that #1 was
> exacerbated because our old boxes would either refuse to send streams or
> refuse to accept them.  It seemed to get better when we upgraded the older
> boxes to use Java 8.
>
> 5.  Don't use nodetool decommission if you have a large number of nodes.
> Instead, use nodetool removenode.  It's MUCH faster and does M-N
> replication between nodes directly.  The downside is that you go down to
> N-1 replicas during this process. However, it was easily 20-30x faster.
> This probably saved me about 5 hours of sleep!
>
> In hindsight, I'm not sure what we would have done differently.  Maybe
> bought more boxes.  Maybe upgraded to Cassandra 2.2 and probably java 8 as
> well.
>
> Setting up datacenter migration might have worked out better too.
>
> Kevin
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Re: Realtime data and (C)AP

2015-10-08 Thread Sebastian Estevez
Renato, please watch this Netflix video on consistency:

http://www.planetcassandra.org/blog/a-netflix-experiment-eventual-consistency-hopeful-consistency-by-christos-kalantzis/

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Oct 8, 2015 at 7:34 PM, Jonathan Haddad  wrote:

> Your options are
>
> 1. Read & write at quorum
> 2. Recognize that, in general, if you've got a real need for Cassandra,
> your data is out of date almost immediately after you've read it no matter
> what guarantee your DB gives you, so you might as well just forget about
> ever getting the "right" answer because you probably can't even define what
> that is.
>
> On Thu, Oct 8, 2015 at 4:17 PM Renato Perini 
> wrote:
>
>> How can the two things fit together?
>> Cassandra endorses the AP side of the CAP theorem, so how can Cassandra
>> deliver realtime consistent data?
>> AFAIK, choosing a consistency level equal to ALL can be a huge
>> performance hit for C*, so please explain why I should choose C*
>> for realtime data
>> that should be consistent across reads after the data has been written.
>>
>> Thank you.
>>
>>


Re: Does failing to run "nodetool cleanup" end up causing more data to be transferred during bootstrapping?

2015-10-07 Thread Sebastian Estevez
vnodes or single tokens?

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Oct 8, 2015 at 12:06 AM, Kevin Burton  wrote:

> Let's say I have 10 nodes, I add 5 more, if I fail to run nodetool
> cleanup, is excessive data transferred when I add the 6th node?  IE do the
> existing nodes send more data to the 6th node?
>
> The documentation is unclear.  It sounds like the biggest problem is that
> the existing data causes things to become unbalanced due to "load" being
> computed wrong.
>
> but I also think that the excessive data will be removed in the next major
> compaction and that nodetool cleanup just triggers a major compaction.
>
> Is my hypothesis correct?
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Re: Node in DL status cannot be removed

2015-10-07 Thread Sebastian Estevez
Doesn't sound like it finished decommissioning (which takes time, because it
will actually stream that node's data to the other nodes that take over its
token ranges).

1) If your node still exists, bring it up and run nodetool decommission. Wait
until the decommission finishes.

2) You can do nodetool removenode if you've already gotten rid of the node.

3) If all else fails, you can use the assassinate endpoint operation in jmx
to get rid of a node. But only if all else fails.
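
As a sketch, using the Host ID and IP from the nodetool status output quoted
below:

nodetool decommission                                     # run on the leaving node itself
nodetool removenode 08638815-b721-46c4-b77c-af08285226db  # run from a live node, by Host ID
# last resort, from any JMX client: on the MBean
# org.apache.cassandra.net:type=Gossiper, invoke
# unsafeAssassinateEndpoint("172.31.16.191")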



All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Oct 7, 2015 at 12:55 PM, Rock Zhang  wrote:

> Hi All,
>
>
> After I decommissioned the node, I see the node status is DL and it is always
> there, even after I removed the node. Does anybody know how to remove it?
> Many thanks.
>
>
> ubuntu@ip-172-31-30-145:~$ nodetool status
>
> Datacenter: DC1
>
> ===
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address        Load       Tokens  Owns  Host ID                               Rack
> DL  172.31.16.191  249.34 GB  256     ?     08638815-b721-46c4-b77c-af08285226db  RAC1
> UN  172.31.30.145  978.93 GB  256     ?     bde31dd9-ff1d-4f2f-b28d-fe54d0531c51  RAC1
> UN  172.31.6.79    506.86 GB  256     ?     15795fca-5425-41cd-909c-c1756715442a  RAC2
>
>
>
> Thanks
>
> Rock
>


Re: Fwd: Column family ID mismatch

2015-10-01 Thread Sebastian Estevez
Check this post & my response:

http://stackoverflow.com/questions/31576180/cassandra-2-1-system-schema-missing
On Oct 1, 2015 4:33 AM, "kedar"  wrote:

> Got this error again on a single node cassandra.
>
> Would appreciate some pointers.
>
>  Forwarded Message  Subject: Column family ID mismatch Date:
> Thu, 13 Aug 2015 20:44:04 +0530 From: kedar 
>  Reply-To: user@cassandra.apache.org To:
> user@cassandra.apache.org 
> 
>
> Hi All,
>
> My keyspace is created as:
>
> CREATE KEYSPACE  WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '2'}  AND durable_writes = true;
>
> However I am running a single node cluster:
>
> ./nodetool status 
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address         Load     Tokens  Owns (effective)  Host ID  Rack
> XX  XXX.XXX.XXX.XX  3.73 GB  256     100.0%                     rack1
>
> And things were still running fine till today we encountered:
>
> ERROR [MigrationStage:1] 2015-08-13 01:58:49,249 CassandraDaemon.java:153
> - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException:
> org.apache.cassandra.exceptions.ConfigurationException: Column family ID
> mismatch (found ; expected )
> at
> org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1125)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.DefsTables.updateColumnFamily(DefsTables.java:422)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:295)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:194)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:166)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:393)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> ~[na:1.7.0_65]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> ~[na:1.7.0_65]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ~[na:1.7.0_65]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: org.apache.cassandra.exceptions.ConfigurationException: Column
> family ID mismatch (found ; expected )
> at
> org.apache.cassandra.config.CFMetaData.validateCompatility(CFMetaData.java:1208)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:1140)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:1121)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> ... 11 common frames omitted
>
>
> Did nodetool repair, which probably didn't work, so after a few mins did a restart
> and then the problem went away.
>
> Need help in understanding what caused it and how it was resolved.
>
> Thanks
>
>
>
>


RE: Consistency Issues

2015-10-01 Thread Sebastian Estevez
You're running describe with CL QUORUM, aren't you?

To see the inconsistency you'd have to check the system.schema_columnfamilies
table on each node.
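
Something along these lines on each node (the keyspace name is a placeholder):

cqlsh <node_ip> -e "SELECT keyspace_name, columnfamily_name, cf_id FROM system.schema_columnfamilies WHERE keyspace_name = 'myks';"

If a table is missing or its cf_id differs between nodes, you have a schema
disagreement; nodetool describecluster should likewise show more than one
schema version.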
On Oct 1, 2015 8:07 AM, "Walsh, Stephen"  wrote:

> No such thing as a stupid question :)
>
> I know they exist on some nodes, but whether they replicated correctly is a
> different story.
>
> I’m checking this one now.
>
>
>
> Ok, hooked up OpsCenter to see what it was saying,
>
> Out of the 100 keyspaces created,
>
> 9 are missing one CF
>
> 2 are missing two CF’s
>
> 1 is missing three CF’s
>
>
>
> It looks like the replication of the tables did not complete to all nodes?
>
>
>
> Looking at each of the 4 nodes at the keyspace with 3 missing CF’s
>
> (via CQLSH_HOST=x.x.x.x cqlsh & “Describe keyspace XXX;”)
>
>
>
> Node 1 : has all CF’s
>
> Node 2 : has all CF’s
>
> Node 3 : has all CF’s
>
> Node 4 : has all CF’s
>
>
>
>
>
> This is indeed very strange….
>
>
>
>
>
> *From:* Carlos Alonso [mailto:i...@mrcalonso.com]
> *Sent:* 01 October 2015 12:05
> *To:* user@cassandra.apache.org
> *Subject:* Re: Consistency Issues
>
>
>
> And that's a stupid one, I know, but does the column you're trying to
> access actually exist?
>
>
> Carlos Alonso | Software Engineer | @calonso 
>
>
>
> On 1 October 2015 at 11:09, Walsh, Stephen 
> wrote:
>
>> I did think of that and they are all the same version :)
>
>
>
>
>
> *From:* Carlos Alonso [mailto:i...@mrcalonso.com]
> *Sent:* 01 October 2015 10:11
>
>
> *To:* user@cassandra.apache.org
> *Subject:* Re: Consistency Issues
>
>
>
> Hi Stephen.
>
>
>
> The UnknownColumnFamilyException made me think of a possible schema
> disagreement, in which one of your nodes has a different version and
> therefore you cannot reach quorum?
>
>
>
> Can you run nodetool describecluster and see if all nodes have the same
> schema versions?
>
>
>
> Cheers!
>
>
> Carlos Alonso | Software Engineer | @calonso 
>
>
>
> On 1 October 2015 at 09:49, Walsh, Stephen 
> wrote:
>
> If you’re looking for the clean-up of the old gen in the jvm heap, it
> doesn’t happen.
>
> We have a new gen turning 15 times before it’s pushed to the old gen.
>
> Seems all our data only has a TTL of 10 seconds – very little data is sent
> to the old gen.
>
>
>
> Add in a heap size of 8GB with a new gen size of 2GB, and I don’t think GC is
> our issue.
>
>
>
>
>
> I’m more worried about error messages in the Cassandra log file that state:
>
>
>
>
>
> UnknownColumnFamilyException reading from socket; closing
>
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=cf411b50-6785-11e5-a435-e7be20c92086
>
>
>
> and
>
>
>
> cassandra OutboundTcpConnection.java:313 - error writing to Connection.
>
>
>
>
>
>
>
> But I really need to understand this best practice that was mentioned (on
> number of CF’s) by Jack Krupansky.
>
> Anyone more information on this?
>
>
>
>
>
> Many thanks for all your help, guys, keep it coming :)
>
> Steve
>
>
>
> *From:* Ricardo Sancho [mailto:sancho.rica...@gmail.com]
> *Sent:* 01 October 2015 09:39
> *To:* user@cassandra.apache.org
> *Subject:* RE: Consistency Issues
>
>
>
> Can you tell us how much time your gcs are taking?
> Do you see any especially long ones?
>
> On 1 Oct 2015 09:37, "Walsh, Stephen"  wrote:
>
> There is no load balancer in front of Cassandra; it’s in front of our
> application.
>
> Everyone seems hung up on this point? But it’s not the root cause of the
> inconsistency issue.
>
>
>
> Can anyone verify the best practice for number of CF’s?
>
>
>
>
>
> *From:* Robert Coli [mailto:rc...@eventbrite.com]
> *Sent:* 30 September 2015 18:45
> *To:* user@cassandra.apache.org
> *Subject:* Re: Consistency Issues
>
>
>
> On Wed, Sep 30, 2015 at 9:06 AM, Walsh, Stephen 
> wrote:
>
>
>
> We never had these issues with our first run. It’s only when we added
> another 25% of writes.
>
>
>
> As Jack said, you are probably pushing your GC over a threshold, leading
> to long pause times and inability to meet quorum.
>
>
>
> As Sebastian said, you probably shouldn't need a load balancer in front of
> Cassandra.
>
>
>
> =Rob
>
>
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.

Re: Consistency Issues

2015-09-30 Thread Sebastian Estevez
Can you provide exact details on where your load balancer is? Like Michael
said, you shouldn't need one between your client and the c* cluster if
you're using a DataStax driver.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Sep 30, 2015 at 12:06 PM, Walsh, Stephen 
wrote:

> Many thanks all,
>
>
>
> The load balancers are only between our own nodes and not a middle-man
> to Cassandra. It’s just so we can push more data into Cassandra.
>
> The only reason we are not using 2.1.9 is time; we haven’t had time to
> test upgrades.
>
>
>
> I wasn’t able to find any best practices for the number of CFs; where do you
> see this documented?
>
> I see a lot of comments on 1,000 CF’s vs. 1,000 keyspaces.
>
>
>
> Errors occur a few times a second, about 10 or so.
>
> They are constant.
>
>
>
> Our TTL is 10 seconds on data with gc_grace_seconds set to 0 on each CF.
>
> We don’t seem to get any OOM errors.
>
>
>
> We never had these issue with our first run. Its only when we added
> another 25% of writes.
>
>
>
> Many thanks for taking the time to reply Jack
>
>
>
>
>
>
>
> *From:* Jack Krupansky [mailto:jack.krupan...@gmail.com]
> *Sent:* 30 September 2015 16:53
> *To:* user@cassandra.apache.org
> *Subject:* Re: Consistency Issues
>
>
>
> More than "low hundreds" (200 or 300 max, and preferably under 100) of
> tables/column families is not exactly a recommended best practice. You may
> be able to get it to work, but probably only with very heavy tuning (i.e.,
> lots of time and playing with options) on your own part. IOW, no quick and
> easy solution.
>
>
>
> The only immediate issue that pops to mind is that you are hitting a GC
> pause due to the large heap size and high volume.
>
>
>
> How frequently are these errors occurring? Like, how much data can you load
> before the first one pops up, and are they then frequent/constant or just
> occasionally/rarely?
>
>
>
> Can you test to see if you can see similar timeouts with say only 100 or
> 50 tables? At least that might isolate whether the issue relates at all to
> the number of tables vs. raw data rate or GC pause.
>
>
>
> Sometimes you can reduce/eliminate the GC pause issue by reducing the heap
> so that it is only modestly above the minimum required to avoid OOM.
>
>
>
>
> -- Jack Krupansky
>
>
>
> On Wed, Sep 30, 2015 at 11:22 AM, Walsh, Stephen 
> wrote:
>
> More information,
>
>
>
> I’ve just setup a NTP server to rule out any timing issues.
>
> And I also see this in the Cassandra node log files
>
>
>
> MessagingService-Incoming-/172.31.22.4] 2015-09-30 15:19:14,769
> IncomingTcpConnection.java:97 - UnknownColumnFamilyException reading from
> socket; closing
>
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=cf411b50-6785-11e5-a435-e7be20c92086
>
>
>
> Any idea what this is related too?
>
> All these tests are run with a clean setup of Cassandra  nodes followed by
> a nodetool repair.
>
> Before any data hits them.
>
>
>
>
>
> *From:* Walsh, Stephen [mailto:stephen.wa...@aspect.com]
> *Sent:* 30 September 2015 15:17
> *To:* user@cassandra.apache.org
> *Subject:* Consistency Issues
>
>
>
> Hi there,
>
>
>
> We are having some issues with consistency. I’ll try my best to explain.
>
>
>
> We have an application that was able to
>
> Write ~1000 p/s
>
> Read ~300 p/s
>
> Total CF created: 400
>
> Total Keyspaces created : 80
>
>
>
> On a 4 node Cassandra Cluster with
>
> Version 2.1.6
>
> Replication : 3
>
> Consistency  (Read & Write) : LOCAL_QUORUM
>
> Cores : 4
>
> Ram : 15 GB
>
> Heap Size 8GB
>
>
>
> This was fine and worked, but was pushing our application to the max.
>
>
>
> -
>
>
>
> Next we added a load balancer (HaProxy) to our application.
>
> So now we have 3 of our nodes talking to 4 Cassandra Nodes with a load of
>
> Write ~1250 p/s
>
> Read 0p/s
>
> Total CF created: 450
>
> Total Keyspaces created : 100
>
>
>
> On our application we now see
>
> Cassandra timeout during write query at consistency LOCAL_QUORUM (2
> replica were required but only 1 acknowledged the write)
>
> (we are using ja

Re: Do vnodes need more memory?

2015-09-23 Thread Sebastian Estevez
This is interesting; where are you seeing that you're collecting 50% of the
time? Is your env.sh the default? How much RAM?

Also, can you run this tool and send a minute's worth of thread info:

wget
https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar
java -jar sjk-plus-0.3.6.jar ttop -s localhost:7199 -n 30 -o CPU
On Sep 23, 2015 7:09 AM, "Tom van den Berge" 
wrote:

> I have two data centers, each with the same number of nodes, same hardware
> (CPUs, memory), Cassandra version (2.1.6), replication factor, etc. The
> only difference is that one data center uses vnodes, and the other doesn't.
>
> The non-vnode DC works fine (and has been for a long time) under
> production load: I'm seeing normal CPU and IO load and garbage collection
> figures. But the vnode DC is struggling very hard under the same load. It
> has been set up recently. The CPU load is very high, due to excessive
> garbage collection (>50% of the time is spent collecting).
>
> So it seems that Cassandra simply doesn't have enough memory. I'm trying
> to understand if this can be caused by the use of vnodes? Is there a
> sensible reason why vnodes would consume more memory than regular nodes? Or
> does any of you have the same experience? If not, I might be barking up the
> wrong tree here, and I would love to know it before upgrading my servers
> with more memory.
>
> Thanks,
> Tom
>


Re: Cassandra Summit 2015 Roll Call!

2015-09-23 Thread Sebastian Estevez
Hey guys, find me at the Startups booth! Looking forward to meeting some of
you in person :)
On Sep 22, 2015 8:44 PM, "Steve Robenalt"  wrote:

> I am here. Wearing my assorted Cassandra shirts from meetups and
> conferences. Would be happy to meet anyone from this mailing list because
> the conversations here have been very valuable as I have ramped up with
> Cassandra. And I passed my developer certification today. :-)  I am
> identifiable from my LinkedIn picture.
>
> Steve
>
>
>
> On Tue, Sep 22, 2015 at 8:19 PM, Mohammed Guller 
> wrote:
>
>> Hey everyone,
>>
>> I will be at the summit too on Wed and Thu.  I am giving a talk on
>> Thursday at 2.40pm.
>>
>>
>>
>> Would love to meet everyone on this list in person.  Here is an old
>> picture of mine:
>>
>>
>> https://events.mfactormeetings.com/accounts/register123/mfactor/datastax/events/dstaxsummit2015/guller.jpg
>>
>>
>>
>> Mohammed
>>
>>
>>
>> *From:* Carlos Alonso [mailto:i...@mrcalonso.com]
>> *Sent:* Tuesday, September 22, 2015 5:23 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Cassandra Summit 2015 Roll Call!
>>
>>
>>
>> Hi guys.
>>
>>
>>
>> I'm already here and I'll be the whole Summit. I'll be doing a live demo
>> on Thursday on troubleshooting Cassandra production issues as a developer.
>>
>>
>>
>> This is me!! https://twitter.com/calonso/status/646352711454097408
>>
>>
>> Carlos Alonso | Software Engineer | @calonso
>> 
>>
>>
>>
>> On 22 September 2015 at 15:27, Jeff Jirsa 
>> wrote:
>>
>> I’m here. Will be speaking Wednesday on DTCS for time series workloads:
>> http://cassandrasummit-datastax.com/agenda/real-world-dtcs-for-operators/
>>
>>
>>
>> Picture if you recognize me, say hi:
>> https://events.mfactormeetings.com/accounts/register123/mfactor/datastax/events/dstaxsummit2015/jirsa.jpg
>>  (probably
>> wearing glasses and carrying a black Crowdstrike backpack)
>>
>>
>>
>> - Jeff
>>
>>
>>
>>
>>
>> *From: *Robert Coli
>> *Reply-To: *"user@cassandra.apache.org"
>> *Date: *Tuesday, September 22, 2015 at 11:27 AM
>> *To: *"user@cassandra.apache.org"
>> *Subject: *Cassandra Summit 2015 Roll Call!
>>
>>
>>
>> Cassandra Summit 2015 is upon us!
>>
>>
>>
>> Every year, the conference gets bigger and bigger, and the chance of IRL
>> meeting people you've "met" online gets smaller and smaller.
>>
>>
>>
>> To improve everyone's chances, if you are attending the summit :
>>
>>
>>
>> 1) respond on-thread with a brief introduction (and physical description
>> of yourself if you want others to be able to spot you!)
>>
>> 2) join #cassandra on freenode IRC (irc.freenode.org) to chat and
>> connect with other attendees!
>>
>>
>>
>> MY CONTRIBUTION :
>>
>> --
>>
>> I will be at the summit on Wednesday and Thursday. I am 5'8" or so, and
>> will be wearing glasses and either a red or blue "Eventbrite Engineering"
>> t-shirt with a graphic logo of gears on it. Come say hello! :D
>>
>>
>>
>> =Rob
>>
>>
>>
>>
>>
>
>
>
> --
> Steve Robenalt
> Software Architect
> sroben...@highwire.org 
> (office/cell): 916-505-1785
>
> HighWire Press, Inc.
> 425 Broadway St, Redwood City, CA 94063
> www.highwire.org
>
> Technology for Scholarly Communication
>


Re: Unable to remove dead node from cluster.

2015-09-21 Thread Sebastian Estevez
Order is decommission, remove, assassinate.

Which have you tried?
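
Roughly, that order looks like this (the IP is taken from your output; the
jmxterm jar is just one JMX client you could use, and its file name here is
an assumption):

nodetool decommission            # on the node being removed, while it is still up
nodetool removenode <Host ID>    # from a live node once it is down, using the Host ID from nodetool status
# assassinate is the last resort; on 2.1 it is only exposed over JMX, e.g.:
echo "run -b org.apache.cassandra.net:type=Gossiper unsafeAssassinateEndpoint 10.210.165.55" | \
  java -jar jmxterm.jar -l localhost:7199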
On Sep 21, 2015 10:47 AM, "Dikang Gu"  wrote:

> Hi there,
>
> I have a dead node in our cluster, which is a wired state right now, and
> can not be removed from cluster.
>
> The nodestatus shows:
> Datacenter: DC1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load   Tokens  OwnsHost ID
> Rack
> DN  10.210.165.55?  256 ?   null
>r1
>
> I tried the unsafeAssassinateEndpoint, but got exception like:
> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
> now DOWN
> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
> Thread[GossipStage:1,5,main]
> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
> 2015-09-18_23:21:40.80669   at
> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80669   at
> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80670   at
> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80671   at
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80671   at
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80672   at
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80674   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.80674   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
> local pause of 10852378435 > 50
>
> Any suggestions about how to remove it?
> Thanks.
>
> --
> Dikang
>
>


Re: No schema agreement from live replicas

2015-09-16 Thread Sebastian Estevez
Check nodetool describecluster to see the schema versions across your nodes.

A rolling restart will help propagate schema if you have disagreement
across your nodes.

Just FYI: We do a lot of schema creation and deletion on Cassandra
> (basically our CI pipeline does that), if that has anything to do with it.


You should not do schema creation and deletion automatically in Cassandra
(especially from a distributed app) because you risk schema disagreement
which can lead to issues like this one:

http://stackoverflow.com/questions/31576180/cassandra-2-1-system-schema-missing

If you do a schema modification from a single-threaded process, always
check for schema agreement before moving forward using
https://datastax.github.io/java-driver/2.0.10/features/metadata/
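
For example, from the command line (the driver link above covers the
programmatic equivalent, e.g. checkSchemaAgreement() on the cluster metadata
in recent driver versions):

nodetool describecluster                                     # all nodes should show a single schema version
cqlsh -e "SELECT schema_version FROM system.local;"          # this node's version
cqlsh -e "SELECT peer, schema_version FROM system.peers;"    # the versions it sees for its peers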


All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Sep 16, 2015 at 5:19 PM, Saurabh Sethi 
wrote:

> We have a 3 node Cassandra cluster (version 2.2.0). Two of the nodes are
> in one subnet and third one in different subnet.
>
> Replication factor is 2 and two of the nodes are seed nodes. Consistency
> level for everything is QUORUM.
>
> We are getting the following warning:
>
> *WARN com.datastax.driver.core.Cluster - No schema agreement from live
> replicas after 30 s. The schema may not be up to date on some nodes.*
>
> I made sure that the clocks are in-sync.
>
> I then ran cleanup, repair and compact but that solves the problem only
> for couple of days, the warning then comes back.
>
> We were seeing this issue in 2.1.0 but stopped seeing it in 2.1.4 and
> 2.1.6. But this came back in version 2.2.0.
>
> Anyone has any idea on how to fix this problem?
>
> Just FYI: We do a lot of schema creation and deletion on Cassandra
> (basically our CI pipeline does that), if that has anything to do with it.
>
> Thanks,
> Saurabh
>


Re: Cassandra adding node issue (no UJ status)

2015-09-15 Thread Sebastian Estevez
Check https://issues.apache.org/jira/browse/CASSANDRA-8611
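
If it was a failed stream, re-bootstrapping (per Mark's suggestion below) is
the usual fix -- a sketch; the paths are the package defaults and may differ
on your install:

sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
sudo service cassandra start     # the node bootstraps again from scratch
nodetool repair                  # on the new node, if things still look inconsistent after the join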

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Sep 15, 2015 at 8:44 AM, Mark Greene  wrote:

> Hey Rock,
>
> I've seen this occur as well. I've come to learn that in some cases, like
> a network blip, the join can fail. There is usually something in the log to
> the effect of "Stream failed"
>
> When I encounter this issue, I make an attempt to bootstrap the new node
> again. If that doesn't help, I run a repair on the new node.
>
>
> On Tue, Sep 15, 2015 at 3:14 AM, Rock Zhang  wrote:
>
>> Hi All,
>>
>>
>> Now I have a problem whenever I add a new node: the data does not get
>> balanced; the node just joins as a new, nearly empty node.
>>
>>
>> Originally I saw the UJ status. Has anybody experienced this kind of issue?
>> I don't know what changed.
>>
>>
>> ubuntu@ip-172-31-15-242:/etc/cassandra$ nodetool status rawdata
>>
>> Datacenter: DC1
>>
>> ===
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  AddressLoad   Tokens  Owns (effective)  Host ID
>> Rack
>>
>> UN  172.31.16.191  742.15 GB  256 21.1%
>> 08638815-b721-46c4-b77c-af08285226db  RAC1
>>
>> UN  172.31.30.145  774.42 GB  256 21.2%
>> bde31dd9-ff1d-4f2f-b28d-fe54d0531c51  RAC1
>>
>> UN  172.31.6.79592.9 GB   256 19.8%
>> 15795fca-5425-41cd-909c-c1756715442a  RAC2
>>
>> UN  172.31.27.186  674.42 GB  256 18.4%
>> 9685a476-1da7-4c6f-819e-dd4483e3345e  RAC1
>>
>> UN  172.31.7.31642.47 GB  256 19.9%
>> f7c8c6fb-ab37-4124-ba1a-a9a1beaecc1b  RAC1
>>
>> *UN  172.31.15.242  37.4 MB256 19.8%
>> c3eff010-9904-49a0-83cd-258dc5a98525  RAC1*
>>
>> UN  172.31.24.32   780.59 GB  256 20.1%
>> ffa58bd1-3188-440d-94c9-97166ee4b735  RAC1
>>
>> *UN  172.31.3.4080.75 GB   256 18.9%
>> 01ce3f96-ebc0-4128-9ec3-ddd1a9845d51  RAC1*
>>
>> UN  172.31.31.238  756.59 GB  256 19.9%
>> 82d34a3b-4f12-4874-816c-7d89a0535577  RAC1
>>
>> UN  172.31.31.99   583.68 GB  256 20.8%
>> 2b10194f-23d2-4bdc-bcfa-7961a149cd11  RAC2
>>
>>
>> Thanks
>>
>> Rock
>>
>
>


Re: Importing data from SQL Server

2015-09-14 Thread Sebastian Estevez
If you have a CSV, try Brian's cassandra-loader. It is a full-featured C*
Java import program built with all the best practices for data loading and
writes.

https://github.com/brianmhess/cassandra-loader
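
A minimal invocation looks something like this -- the host, file name, and
column list are placeholders (check the project README for the authoritative
flags):

java -jar cassandra-loader.jar -f data.csv -host 10.0.0.1 -schema "myks.mytable(col1, col2, col3)"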

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Sep 14, 2015 at 11:55 AM, Jason Kushmaul <
jkushm...@rocketfuelinc.com> wrote:

> I had to write my own, but not due to lack of support.  I found I needed
> to preprocess the data before I put it into cassandra, you might find that
> beneficial. I only had 3 massive tables to worry about so it wasn't that
> much extra work to code it out - your case might be different if you have
> 50 tables, or if this needs to recur.
>
>  If you don't need to preprocess the data, what you could probably do is
> export your data from MSSQL to CSV (built into sql mgmt studio)
>  and use CQL to import the CSV -
> http://www.datastax.com/dev/blog/simple-data-importing-and-exporting-with-cassandra
>
>
> On Mon, Sep 14, 2015 at 11:37 AM, Raluca Marcu 
> wrote:
>
>>
>> Kevin Burton  charter.net> writes:
>>
>> >
>> > I have seen numerous posts on transferring data from MySql to Cassandra
>> but have yet to find a good way to
>> > transfer directly from a Microsoft SQL Server table to a Cassandra CF.
>> Even better would be a method to take
>> > as input the output of an arbitrary SQL query. Ideas?
>> >
>>
>>
>> Hello,
>>
>> I realize this post is pretty old, but I was wondering if you found an
>> answer to this question and if you've used it.
>> I have to load data from SQL Server into Cassandra and I am completely new
>> to Cassandra and all of the posts I seem to be able to find are demos
>> about loading from MySQL into Cassandra.
>> Any, any help would be extremely appreciated!
>> Thank you very much!
>> Raluca
>>
>>
>>
>>
>>
>>
>
>
> --
> Jason Kushmaul | 517.899.7852
> Engineering Manager
>


Re: Why I can not do a "count(*) ... allow filtering " without facing operation timeout?

2015-09-04 Thread Sebastian Estevez
I hope this is not a production query...
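
(If you do need to run it ad hoc, Tommy's suggestion below -- raising the
client-side timeout -- looks roughly like this; the exact knob depends on
your cqlsh version:)

cqlsh --request-timeout=3600     # newer cqlsh versions
# older versions instead read ~/.cassandra/cqlshrc:
#   [connection]
#   client_timeout = 3600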

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Sep 4, 2015 at 4:34 AM, Tommy Stendahl 
wrote:

> Hi,
>
> Checkout CASSANDRA-8899, my guess is that you have to increase the timeout
> in cqlsh.
>
> /Tommy
>
>
> On 2015-09-04 10:31, shahab wrote:
>
> Hi,
>
> This is probably a silly problem, but it is really serious for me. I have
> a cluster of 3 nodes with replication factor 2, but I still cannot run a
> simple "select count(*) from ..." using either DevCenter or cqlsh.
> Any idea how this can be done?
>
> best,
> /Shahab
>
>
>


Re: Adding New Nodes/Data Center to an existing Cluster.

2015-09-01 Thread Sebastian Estevez
DSE 4.7 ships with Cassandra 2.1 for stability.
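
(And per Alain's point below: populating a brand-new DC is a job for
nodetool rebuild, not repair. A sketch, where DC1 stands in for the name of
your existing data center:)

# run on each node in the NEW data center, streaming data from the existing DC:
nodetool rebuild -- DC1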

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Sep 1, 2015 at 12:53 PM, Sachin Nikam  wrote:

> @Neha,
> We are using DSE 4.7 & Cassandra 2.2
>
> @Alain,
> I will check with our OPS team about repair vs rebuild and get back to you.
> Regards
> Sachin
>
> On Tue, Sep 1, 2015 at 5:59 AM, Alain RODRIGUEZ 
> wrote:
>
>> Hi Sachin,
>>
>> You are speaking about a repair, when the proper command to do this is
>> "rebuild" ?
>>
>> Did you tried adding your DC this way:
>> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
>>  ?
>>
>>
>> 2015-09-01 5:32 GMT+02:00 Neha Trivedi :
>>
>>> Hi,
>>> Can you specify which version of Cassandra you are using?
>>> Can you provide the Error Stack ?
>>>
>>> regards
>>> Neha
>>>
>>> On Tue, Sep 1, 2015 at 2:56 AM, Sebastian Estevez <
>>> sebastian.este...@datastax.com> wrote:
>>>
>>>> or https://issues.apache.org/jira/browse/CASSANDRA-8611 perhaps
>>>>
>>>> All the best,
>>>>
>>>> Sebastián Estévez
>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>>
>>>> On Mon, Aug 31, 2015 at 5:24 PM, Eric Evans 
>>>> wrote:
>>>>
>>>>>
>>>>> On Mon, Aug 31, 2015 at 1:32 PM, Sachin Nikam 
>>>>> wrote:
>>>>>
>>>>>> When we add 3 more nodes in Data Center B, the repair tool starts
>>>>>> syncing the data between 2 data centers and then gives up after ~2 days.
>>>>>>
>>>>>> Has anybody run in to similar issue before? If so what is the
>>>>>> solution?
>>>>>>
>>>>>
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-9624, maybe?
>>>>>
>>>>>
>>>>> --
>>>>> Eric Evans
>>>>> eev...@wikimedia.org
>>>>>
>>>>
>>>>
>>>
>>
>


Re: Re : Decommissioned node appears in logs, and is sometimes marked as "UNREACHEABLE" in `nodetool describecluster`

2015-09-01 Thread Sebastian Estevez
Are they in the system.peers table?
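
A quick way to check (and, on these 1.2/2.0-era versions, a known last-resort
cleanup) is to look at system.peers directly -- the IP below is an example,
and the DELETE should only be run on each node that still lists the ghost
entry:

cqlsh -e "SELECT peer, host_id FROM system.peers;"
cqlsh -e "DELETE FROM system.peers WHERE peer = '10.20.30.40';"    # last resort only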
On Aug 28, 2015 4:21 PM, "sai krishnam raju potturi" 
wrote:

> We are using DSE on our clusters.
>
> DSE version : 4.6.7
> Cassandra version : 2.0.14
>
> thanks
> Sai Potturi
>
>
>
> On Fri, Aug 28, 2015 at 3:40 PM, Robert Coli  wrote:
>
>> On Fri, Aug 28, 2015 at 11:32 AM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>>> we decommissioned nodes in a datacenter a while back. Those nodes
>>> keep showing up in the logs, and also sometimes marked as UNREACHABLE when
>>> `nodetool describecluster` is run.
>>>
>>
>> What version of Cassandra?
>>
>> This happens a lot in 1.0-2.0.
>>
>> =Rob
>>
>
>


Re: Adding New Nodes/Data Center to an existing Cluster.

2015-08-31 Thread Sebastian Estevez
or https://issues.apache.org/jira/browse/CASSANDRA-8611 perhaps

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Aug 31, 2015 at 5:24 PM, Eric Evans  wrote:

>
> On Mon, Aug 31, 2015 at 1:32 PM, Sachin Nikam  wrote:
>
>> When we add 3 more nodes in Data Center B, the repair tool starts syncing
>> the data between 2 data centers and then gives up after ~2 days.
>>
>> Has anybody run in to similar issue before? If so what is the solution?
>>
>
> https://issues.apache.org/jira/browse/CASSANDRA-9624, maybe?
>
>
> --
> Eric Evans
> eev...@wikimedia.org
>


Re: Is ZooKeeper still use in Cassandra?

2015-08-30 Thread Sebastian Estevez
No, it does not; Apache Cassandra uses gossip for cluster membership, not
ZooKeeper. Where are you reading this?
On Aug 30, 2015 5:09 PM, "ibrahim El-sanosi" 
wrote:

> Hi folks,
>
> I read Cassandra white paper, I come across a text says "Cassandra system
> elects a leader amongst its nodes using a system called Zookeeper[13]. All
> nodes on joining the cluster contact the leader who tells them for what
> ranges they are replicas for and leader makes a concerted effort to
> maintain the invariant that no node is responsible for more than N-1 ranges
> in the ring. The metadata about the ranges a node is responsible is cached
> locally at each node and in a fault-tolerant manner inside Zookeeper - this
> way a node that crashes and comes back up knows what ranges it was
> responsible for."
>
> Does Cassandra still use ZooKeeper? if yes can you refer me to any related
> article?
>
> Regards,
>
> Ibrahim
>


Re: Delete semantics

2015-08-24 Thread Sebastian Estevez
Hi Cameron,

SELECTs did not always have the ability to use IN clauses everywhere, and
that functionality has not yet been ported to DELETEs. This Jira should give
you what you're looking for (ETA 3.0 beta 2):

CASSANDRA-6237 <https://issues.apache.org/jira/browse/CASSANDRA-6237>

Check out CASSANDRA-6446
<https://issues.apache.org/jira/browse/CASSANDRA-6446> for details on how
range tombstones work, and note their effects on performance even with this
patch.


All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Aug 21, 2015 at 7:41 PM, Cameron Little 
wrote:

> Can anyone help me understand the semantics of the DELETE cql statement,
> specifically the WHERE… part?
>
> Taken literally, the datastax documentation at
> http://docs.datastax.com/en/cql/3.1/cql/cql_reference/delete_r.html seems
> to indicate a single row specification can be used.
>
> The documentation at
> https://cassandra.apache.org/doc/cql3/CQL.html#deleteStmt seems to
> indicate that the row specifications can be in any order.
>
>
> Here’s what I’ve found so far from testing.
>
> - Identifiers must be primary key columns.
> - A single IN clause (<identifier> IN '(' <values> ')') is allowed for
> the first primary key column
> - Multiple = clauses (<identifier> '=' <value>) are allowed, starting with
> the first primary key column (not already used), not skipping any, and not
> appearing before an IN clause
>
> For example, the following work for the table:
>
> CREATE TABLE mpk_store (
>   pk_one text,
>   pk_two text,
>   pk_three text,
>   four text,
>   PRIMARY KEY (pk_one, pk_two, pk_three)
> )
>
> DELETE FROM mpk_store WHERE pk_one IN ('a', 'b') AND pk_two = 'a';
> DELETE FROM mpk_store WHERE pk_one IN ('a', 'b') AND pk_two = 'a' AND
> pk_three = 'b';
> DELETE FROM mpk_store WHERE pk_one IN ('a', 'b');
> DELETE FROM mpk_store WHERE pk_one = 'a';
>
> The following return Bad Request errors:
>
> DELETE FROM mpk_store WHERE pk_one IN ('a', 'b') AND pk_two IN ('a', 'b');
> DELETE FROM mpk_store WHERE pk_one = 'test_fetch_partial_limit' AND pk_two
> IN ('a', 'b');
> DELETE FROM mpk_store WHERE pk_one IN ('a', 'b') AND pk_two IN ('a', 'b')
> AND pk_three = 'b';
>
> This is a bit weird, since select allows IN clauses anywhere in the
> statement.
>
>
> Can anyone help explain these semantics or why Cassandra does this?
>
> Thanks,
> Cameron Little
>
>


Re: Question about how to remove data

2015-08-21 Thread Sebastian Estevez
To clarify, you do not need a TTL for deletes to be compacted away in
Cassandra. When you delete, we create a tombstone which will remain in the
system __at least__ gc_grace_seconds. We wait this long to give the
tombstone a chance to make it to all replica nodes; the best practice is to
run repairs at least as often as gc_grace_seconds in order to avoid edge
cases where data comes back to life (i.e. the tombstone never reached one of
your replicas, and once the tombstones and data are removed from the other
two replicas, all that is left is the old value).

__at least__ are the key words in the previous paragraph: there are more
conditions that need to be met in order for a tombstone to actually get
cleaned up. As with most things in Cassandra, these conditions are
configurable (via the following compaction sub-properties):

http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configure_compaction_t.html
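
For example (a sketch only -- the keyspace/table are placeholders, and these
values should be tuned for your workload rather than copied):

cqlsh -e "ALTER TABLE myks.mytable WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.2',
    'unchecked_tombstone_compaction': 'true' };"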

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Aug 20, 2015 at 4:13 PM, Daniel Chia  wrote:

> The TTL shouldn't matter if you deleted the data, since to my
> understanding the delete should shadow the data signaling to C* that the
> data is a candidate for removal on compaction.
>
> Others might know better, but it could very well be the fact that
> gc_grace_seconds is 0 that is causing your problems. Others might have
> other suggestions, but you could potentially use sstable2json to see the
> raw contents of the sstable on disk and see why data is still there.
>
> Thanks,
> Daniel
>
> On Thu, Aug 20, 2015 at 12:55 PM, Analia Lorenzatto <
> analialorenza...@gmail.com> wrote:
>
>> Hello,
>>
>> Daniel, I am using Size Tiered compaction.
>>
>> My concern is that as I do not have a TTL defined on the Column family,
>> and I do not have the possibility to create it.   Perhaps, the "deleted
>> data" is never actually going to be removed?
>>
>> Thanks a lot!
>>
>>
>> On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia 
>> wrote:
>>
>>> Is this a LCS family, or Size Tiered? Manually running compaction on LCS
>>> doesn't do anything until C* 2.2 (
>>> https://issues.apache.org/jira/browse/CASSANDRA-7272)
>>>
>>> Thanks,
>>> Daniel
>>>
>>> On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto <
>>> analialorenza...@gmail.com> wrote:
>>>
 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 "Hapiness is not something really made. It comes from your own actions"
 by Dalai Lama


 On 19 Aug 2015 6:19 pm, "Laing, Michael" 
 wrote:

> Possibly you have snapshots? If so, use nodetool to clear them.
>
> On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto <
> analialorenza...@gmail.com> wrote:
>
>> Hello guys,
>>
>> I have a cassandra cluster 2.1 comprised of 4 nodes.
>>
>> I removed a lot of data in a Column Family, then I ran manually a
>> compaction on this Column family on every node.   After doing that, If I
>> query that data, cassandra correctly says this data is not there.  But 
>> the
>> space on disk is exactly the same before removing that data.
>>
>> Also, I realized that  gc_grace_seconds = 0.  Some people on the
>> internet say that it could produce zombie data, what do you think?
>>
>> I do not have a TTL defined on the Column family, and I do not have
>> the possibility to create it.   So my questions is, given that I do not
>> have a TTL defined is data going to be removed?  or the deleted data is
>> never actually going to be deleted due to I do not have a TTL?
>>
>>
>> Thanks in advance!
>>
>> --
>> Saludos / Regards.
>>
>> Analía Lorenzatto.
>>
>> “It's possible to commit no errors and still lose. That is not
>> weakness.  That is life".  By Captain Jean-Luc Picard.
>>
>
>
>>>
>>
>>
>> --
>> Saludos / Regards.
>>
>> Analía Lorenzatto.
>>
>> “It's possible to commit no errors and still lose. That is not weakness.  That is life".  By Captain Jean-Luc Picard.

Re: Written data is lost and no exception thrown back to the client

2015-08-21 Thread Sebastian Estevez
Please let us know the Jira number.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Aug 21, 2015 at 7:06 AM, Robert Wille  wrote:

> But it shouldn’t matter. I have missing data, and no errors, which
> shouldn’t be possible except with CL=ANY.
>
> FWIW, I’m working on some sample code so I can post a Jira.
>
> Robert
>
>
> On Aug 21, 2015, at 5:04 AM, Robert Wille  wrote:
>
> RF=1 with QUORUM consistency. I know QUORUM is weird with RF=1, but it
> should be the same as ONE. It’s QUORUM instead of ONE because production
> has RF=3, and I was running this against my test cluster with RF=1.
>
> On Aug 20, 2015, at 7:28 PM, Jason  wrote:
>
> What consistency level were the writes?
> --
> From: Robert Wille 
> Sent: 8/20/2015 18:25
> To: user@cassandra.apache.org
> Subject: Written data is lost and no exception thrown back to the client
>
> I wrote a data migration application which I was testing, and I pushed it
> too hard and the FlushWriter thread pool blocked, and I ended up with
> dropped mutation messages. I compared the source data against what is in my
> cluster, and as expected I have missing records. The strange thing is that
> my application didn’t error out. I’ve been doing some forensics, and
> there’s a lot about this that makes no sense and makes me feel very uneasy.
>
> I use a lot of asynchronous queries, and I thought it was possible that I
> had bad error handling, so I checked for errors in other, independent ways.
>
> I have a retry policy that on the first failure logs the error and then
> requests a retry. On the second failure it logs the error and then
> rethrows. A few retryable errors appeared in my logs, but no fatal errors.
> In theory, I should have a fatal error in my logs for any error that gets
> reported back to the client.
>
> I wrap my Session object, and all queries go through this wrapper. This
> wrapper logs all query errors. Synchronous queries are wrapped in a
> try/catch which logs and rethrows. Asynchronous queries use a
> FutureCallback to log any onFailure invocations.
>
> My logs indicate that no errors whatsoever were reported back to me. I do
> not understand how I can get dropped mutation messages and not know about
> it. I am running 2.0.16 with datastax Java driver 2.0.8. Three node cluster
> with RF=1. If someone could help me understand how this can occur, I would
> greatly appreciate it. A database that errors out is one thing. A database
> that errors out and makes you think everything was fine is quite another.
>
> Thanks
>
> Robert
>
>
>
>


Re: Null pointer exception after delete in a table with statics

2015-08-19 Thread Sebastian Estevez
Can you include your read code?
On Aug 18, 2015 5:50 AM, "Hervé Rivière"  wrote:

> Hello,
>
>
>
>
>
> I have an issue with a server error (message="java.lang.NullPointerException") when I query a table with static
> fields (without a where clause) with Cassandra 2.1.8 on a 2-node cluster.
>
>
>
> No more indication in the log :
>
> ERROR [SharedPool-Worker-1] 2015-08-18 10:39:02,549 QueryMessage.java:132
> - Unexpected error during query
>
> java.lang.NullPointerException: null
>
> ERROR [SharedPool-Worker-1] 2015-08-18 10:39:02,550 ErrorMessage.java:251
> - Unexpected exception during request
>
> java.lang.NullPointerException: null
>
>
>
>
>
> The scenario was :
>
> 1) loading data inside the table with spark (~12 million rows)
>
> 2) Make some deletes with the primary keys and use the static fields to
> keep a certain state for each partition.
>
>
>
> The null pointer exception occurs when I query all the table after I made
> some deletions.
>
>
>
> I observed that :
>
> - Before delete statement the table is perfectly readable
>
> - It's repeatable (I managed to isolate ~20 delete statements that
> create a null pointer exception when they are executed by cqlsh)
>
> - it occurs only  with some rows (nothing special in these rows compared
> to others)
>
> - Didn't succeed to repeat the problem with the problematic rows inside a
> toy table
>
> - repair/compact and scrub on each node before and after the deletes
> statements didn't change anything (always the null pointer exception after
> the delete)
>
> - Maybe related with static columns ?
>
>
>
> The table structure is :
>
> CREATE TABLE my_table (
>
> pk1 text,
>
> pk2 text,
>
> ck1 timestamp,
>
> ck2 text,
>
> ck3 text,
>
> valuefield text,
>
> staticField1 text static,
>
> staticField2 text static,
>
> PRIMARY KEY ((pk1, pk2), ck1, ck2, ck3)
>
> ) WITH CLUSTERING ORDER BY (pk1 DESC, pk2 ASC, ck1 ASC)
>
> AND bloom_filter_fp_chance = 0.01
>
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>
> AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
>
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND dclocal_read_repair_chance = 0.1
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 0
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 0
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99.0PERCENTILE';
>
>
>
>
>
>
>
>
>
> Is someone already met this issue or has an idea to solve/investigate this
> exceptions ?
>
>
>
>
>
> Thank you
>
>
>
>
>
> Regards
>
>
>
>
>
> --
>
> Hervé
>


Re: Hash function

2015-08-11 Thread Sebastian Estevez
It's not a hash of the IP; the host ID is a randomly generated UUID, so
there's some entropy in there for uniqueness.
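For example, both of these show the node's randomly generated 128-bit host
ID (a UUID, not anything derived from the address):

nodetool status                                 # the Host ID column
cqlsh -e "SELECT host_id FROM system.local;"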
On Aug 11, 2015 5:05 AM, "Thouraya TH"  wrote:

> Hi all,
>
> Each node in cassandra ring has a unique identifier "nodeID" of 128bytes, 
> obtained
> by a hashing of  ?
>
> ip address ?
>
>
>
> Thank you so much for help.
> Kind Regards.
>


Re: Long joining node

2015-08-05 Thread Sebastian Estevez
What's your average data per node? Is 230gb close?

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Wed, Aug 5, 2015 at 8:33 AM, Stan Lemon  wrote:

> I set the stream timeout to 1 hour this morning and started fresh trying
> to join this node.  It took about an hour to stream over 230gb of data, and
> then into hour 2 I wound up back where I was yesterday, the node's load is
> slowly reducing and the netstats does not show sending or receiving
> anything.  I'm not sure how long I should wait before I throw the towel in
> on this attempt. I'm also not really sure what to try next...
>
> The only thing in the logs currently are three entries like this:
>
> ERROR 07:39:44,447 Exception in thread Thread[CompactionExecutor:31,1,main]
> java.lang.RuntimeException: Last written key
> DecoratedKey(8633837336094175369,
> 003076697369746f725f706167655f76696562393663623234633162366131393531363434663830383839346531313237374930303030663264632d303030302d303033302d343030302d3030303030303030663264633a66376436366166382d383564352d313165342d383030302d30303030303035343764623600)
> >= current key DecoratedKey(-6568345298384940765,
> 003076697369746f725f706167655f76696562393663623234633162366131393531363434663830383839346531313237374930303030376464652d303030302d303033302d343030302d3030303030303030376464653a64633930336533382d643766342d313165342d383030302d30303030303730626338386300)
> writing into
> /var/lib/cassandra/data/pi/__shardindex/pi-__shardindex-tmp-jb-644-Data.db
> at
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:143)
> at
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:166)
> at
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:170)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
>
> ANY help is greatly appreciated.
>
> Thanks,
> Stan
>
>
>
>
>
> On Tue, Aug 4, 2015 at 2:23 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> That's the one. I set it to an hour to be safe (if a stream goes above
>> the timeout it will get restarted) but it can probably be lower.
>>
>> All the best,
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>

Re: Long joining node

2015-08-04 Thread Sebastian Estevez
That's the one. I set it to an hour to be safe (if a stream goes above the
timeout it will get restarted) but it can probably be lower.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Aug 4, 2015 at 2:21 PM, Stan Lemon  wrote:

> Sebastian,
> You're referring to streaming_socket_timeout_in_ms correct?  What value do
> you recommend?  All of my nodes are currently at the default 0.
>
> Thanks,
> Stan
>
>
> On Tue, Aug 4, 2015 at 2:16 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> It helps to set stream socket timeout in the yaml so that you don't hang
>> forever on a lost / broken stream.
>>
>> All the best,
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Tue, Aug 4, 2015 at 2:14 PM, Robert Coli  wrote:
>>
>>> On Tue, Aug 4, 2015 at 11:02 AM, Stan Lemon 
>>> wrote:
>>>
>>>> I am attempting to add a 13th node in one of the datacenters. I have
>>>> been monitoring this process from the node itself with nodetool netstats
>>>> and from one of the existing nodes using nodetool status.
>>>>
>>>> On the existing node I see the new node as UJ. I have watched the load
>>>> steadily climb up to about 203.4gb, and then over the last two hours it has
>>>> fluctuated a bit and has been steadily dropping to about 203.1gb
>>>>
>>>
>>> It's probably hung. If I were you I'd probably wipe the node and
>>> re-bootstrap.
>>>
>>> (what version of cassandra/what network are you on (AWS?)/etc.)
>>>
>>> =Rob
>>>
>>>
>>
>>
>


Re: Long joining node

2015-08-04 Thread Sebastian Estevez
It helps to set stream socket timeout in the yaml so that you don't hang
forever on a lost / broken stream.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Aug 4, 2015 at 2:14 PM, Robert Coli  wrote:

> On Tue, Aug 4, 2015 at 11:02 AM, Stan Lemon  wrote:
>
>> I am attempting to add a 13th node in one of the datacenters. I have been
>> monitoring this process from the node itself with nodetool netstats and
>> from one of the existing nodes using nodetool status.
>>
>> On the existing node I see the new node as UJ. I have watched the load
>> steadily climb up to about 203.4gb, and then over the last two hours it has
>> fluctuated a bit and has been steadily dropping to about 203.1gb
>>
>
> It's probably hung. If I were you I'd probably wipe the node and
> re-bootstrap.
>
> (what version of cassandra/what network are you on (AWS?)/etc.)
>
> =Rob
>
>


Re: OpsCenter datastax-agent 300% CPU

2015-07-20 Thread Sebastian Estevez
I recently became aware of a Java driver bug that might be causing similar
symptoms. Do you, perchance, have any keyspaces that have replication
defined against non-existent Data Centers?

https://datastax-oss.atlassian.net/browse/JAVA-702

If so, fixing that replication setting and restarting the agents should fix
this issue.
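
To check on C* 2.0/2.1 (keyspace replication settings live in
system.schema_keyspaces; the keyspace and DC names below are placeholders):

cqlsh -e "SELECT keyspace_name, strategy_options FROM system.schema_keyspaces;"
# fix any keyspace that references a data center that does not exist, e.g.:
cqlsh -e "ALTER KEYSPACE myks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};"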

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Jul 20, 2015 at 11:02 AM, David Comer 
wrote:

> May I please be discontinued from this email?
>
>
>
> May I unsubscribe?
>
>
>
>
>
> *From:* John Wong [mailto:gokoproj...@gmail.com]
> *Sent:* Monday, July 20, 2015 8:37 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: OpsCenter datastax-agent 300% CPU
>
>
>
> Hi all & Sebastain
>
> We recently encountered similar issue. At least we observed agent
> constantly died with OOM. Unfortunately, we are still with 1.2.X and it
> will be a while before we can totally move to Cassandra 2 series.
>
> Is there a backport patch to fix OOM in OpsCenter 5.1 branch? Please let
> us know because losing OpsCenter is a huge deal for administrator.
>
> Thank you.
>
>
>
> On Wed, Jul 15, 2015 at 6:28 PM, Mikhail Strebkov 
> wrote:
>
> Thanks, I think it got resolved after an update.
>
>
>
> Kind regards,
>
> Mikhail
>
>
>
> On Wed, Jul 15, 2015 at 2:04 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
> OpsCenter 5.2 has a couple of fixes that may result in the symptoms you
> described:
>
> http://docs.datastax.com/en/opscenter/5.2/opsc/release_notes/opscReleaseNotes520.html
>
>
>
> · Fixed issues with agent OOM when storing metrics for large
> numbers of tables. (OPSC-5934)
>
> · Improved handling of metrics overflow queue on agent.
> (OPSC-4618)
>
>
>
> It's also got a lot of other great new features --
> http://docs.datastax.com/en/opscenter/5.2/opsc/online_help/services/opscPerformanceService.html
>
>
>
> Let us know if this stops once you upgrade.
>
>
> All the best,
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
>
>
> On Tue, Jul 14, 2015 at 4:40 PM, Mikhail Strebkov 
> wrote:
>
> Looks like it dies with OOM:
> https://gist.github.com/kluyg/03785041e16333015c2c
>
>
>
> On Tue, Jul 14, 2015 at 12:01 PM, Mikhail Strebkov 
> wrote:
>
> OpsCenter 5.1.3 and datastax-agent-5.1.3-standalone.jar
>
>
>
> On Tue, Jul 14, 2015 at 12:00 PM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
> What version of the agents and what version of OpsCenter are you running?
>
> I recently saw something like this and upgrading to matching versions
> fixed the issue.
>
> On Jul 14, 2015

Re: OpsCenter datastax-agent 300% CPU

2015-07-15 Thread Sebastian Estevez
OpsCenter 5.2 has a couple of fixes that may result in the symptoms you 
described:
http://docs.datastax.com/en/opscenter/5.2/opsc/release_notes/opscReleaseNotes520.html


   - Fixed issues with agent OOM when storing metrics for large numbers of
   tables. (OPSC-5934)
   - Improved handling of metrics overflow queue on agent. (OPSC-4618)


It's also got a lot of other great new features -- 
http://docs.datastax.com/en/opscenter/5.2/opsc/online_help/services/opscPerformanceService.html

Let us know if this stops once you upgrade.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Jul 14, 2015 at 4:40 PM, Mikhail Strebkov  
wrote:

> Looks like it dies with OOM: 
> https://gist.github.com/kluyg/03785041e16333015c2c
>
> On Tue, Jul 14, 2015 at 12:01 PM, Mikhail Strebkov  
> wrote:
>
>> OpsCenter 5.1.3 and datastax-agent-5.1.3-standalone.jar
>>
>> On Tue, Jul 14, 2015 at 12:00 PM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> What version of the agents and what version of OpsCenter are you running?
>>>
>>> I recently saw something like this and upgrading to matching versions 
>>> fixed the issue.
>>> On Jul 14, 2015 2:58 PM, "Mikhail Strebkov"  wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Recently I've noticed that most of the nodes have OpsCenter agents 
>>>> running at 300% CPU. Each node has 4 cores, so agents are using 75% of 
>>>> total available CPU.
>>>>
>>>> We're running 5 nodes with OpenSource Cassandra 2.1.8 in AWS using 
>>>> Community AMI. OpsCenter version is 5.1.3. We're using Oracle Java version 
>>>> 1.8.0_45.
>>>>
>>>> *  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND*
>>>> 31501 cassandr  20   0 3599m 296m  14m S  *339*  2.0  48:20.39 
>>>> /opt/jdk/jdk1.8.0_45/bin/java -Xmx128M 
>>>> -Djclouds.mpu.parts.magnitude=10 
>>>> -Djclouds.mpu.parts.size=16777216 
>>>> -Dopscenter.ssl.trustStore=/var/lib/datastax-agent/ssl/agentKeyStore 
>>>> -Dopscenter.ssl.keyStore=/var/lib/datastax-agent/ssl/agentKeyStore 
>>>> -Dopscenter.ssl.keyStorePassword=opscenter 
>>>> -Dagent-pidfile=/var/run/datastax-agent/datastax-agent.pid 
>>>> -Dlog4j.configuration=file:/etc/datastax-agent/log4j.properties 
>>>> -Djava.security.auth.login.config=/etc/datastax-agent/kerberos.config -jar 
>>>> datastax-agent-5.1.3-standalone.jar 
>>>> /var/lib/datastax-agent/conf/address.yaml
>>>>
>>>> The logs from the agent looks strange to me: 
>>>> https://gist.github.com/kluyg/21f78af7adff0a940ed3
>>>>
>>>> The cluster itself seems to be fine, the load is small, nothing bad in 
>>>> Cassandra system.log.
>>>>
>>>> Does anyone know what to tune to bring it back to normal?
>>>>
>>>> Thanks,
>>>> Mikhail
>>>>
>>>
>>
>

Re: OpsCenter datastax-agent 300% CPU

2015-07-14 Thread Sebastian Estevez
What version of the agents and what version of OpsCenter are you running?

I recently saw something like this and upgrading to matching versions fixed
the issue.
On Jul 14, 2015 2:58 PM, "Mikhail Strebkov"  wrote:

> Hi everyone,
>
> Recently I've noticed that most of the nodes have OpsCenter agents running
> at 300% CPU. Each node has 4 cores, so agents are using 75% of total
> available CPU.
>
> We're running 5 nodes with OpenSource Cassandra 2.1.8 in AWS using
> Community AMI. OpsCenter version is 5.1.3. We're using Oracle Java version
> 1.8.0_45.
>
> *  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND*
> 31501 cassandr  20   0 3599m 296m  14m S  *339*  2.0  48:20.39
> /opt/jdk/jdk1.8.0_45/bin/java -Xmx128M -Djclouds.mpu.parts.magnitude=10
> -Djclouds.mpu.parts.size=16777216
> -Dopscenter.ssl.trustStore=/var/lib/datastax-agent/ssl/agentKeyStore
> -Dopscenter.ssl.keyStore=/var/lib/datastax-agent/ssl/agentKeyStore
> -Dopscenter.ssl.keyStorePassword=opscenter
> -Dagent-pidfile=/var/run/datastax-agent/datastax-agent.pid
> -Dlog4j.configuration=file:/etc/datastax-agent/log4j.properties
> -Djava.security.auth.login.config=/etc/datastax-agent/kerberos.config -jar
> datastax-agent-5.1.3-standalone.jar
> /var/lib/datastax-agent/conf/address.yaml
>
> The logs from the agent looks strange to me:
> https://gist.github.com/kluyg/21f78af7adff0a940ed3
>
> The cluster itself seems to be fine, the load is small, nothing bad in
> Cassandra system.log.
>
> Does anyone know what to tune to bring it back to normal?
>
> Thanks,
> Mikhail
>


Re: DROP Table

2015-07-14 Thread Sebastian Estevez
> When you say schema to settle, do you means we provide proper consistency
> level? I don't think there is a provision to do that in tool. Or I can
> change the SYSTEM KEYSPACE definition of replication factor equal to number
> of nodes?

I mean the nodes all have to be on the same schema version.

> In the steps described below for correcting this problem - when you say
> move data from old directory to new, do you mean move the .db file? It
> will override the current file right?

Yes, copy the .db files. No, they will not override existing data.

> Do we have to rename the directory name to remove CFID i.e. just
> column family name without CFID? After that, update the System table as
> well?

No, follow my instructions precisely: do not rename the directories, just
copy to the new one and rm the old one.
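
To make steps 4-7 of the fix (quoted below, from my earlier mail) concrete
-- a sketch only, assuming the default data directory and a table "mytable"
in keyspace "myks"; the cf IDs are placeholders:

cqlsh -e "select * from system.schema_column_families" | grep mytable    # identify the current ("new") cf ID
cp /var/lib/cassandra/data/myks/mytable-<old cf ID>/*.db /var/lib/cassandra/data/myks/mytable-<new cf ID>/
rm -r /var/lib/cassandra/data/myks/mytable-<old cf ID>
nodetool refresh myks mytable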
On Jul 13, 2015 2:52 PM, "Saladi Naidu"  wrote:

> Sebastian,
> Thank you so much for providing detailed explanation. I still have some
> questions and I need to provide some clarifications
>
> 1. We do not have code that is creating the tables dynamically. All DDL
> operations are done through Datastax DevCenter tool. When you say schema to
> settle, do you means we provide proper consistency level? I don't think
> there is a provision to do that in tool. Or I can change the SYSTEM
> KEYSPACE definition of replication factor equal to number of nodes?
>
> 2. In the steps described below for correcting this problem - when you say
> move data from old directory to new, do you mean move the .db file? It
> will override the current file right?
>
> 3. Do we have to rename the directory name to remove CFID i.e. just
> column family name without CFID? After that, update the System table as
> well?
>
>
> Naidu Saladi
>
>   --
>  *From:* Sebastian Estevez 
> *To:* user@cassandra.apache.org; Saladi Naidu 
> *Sent:* Friday, July 10, 2015 5:25 PM
> *Subject:* Re: DROP Table
>
> #1 The cause of this problem is a CREATE TABLE statement collision. Do
> *not* generate tables dynamically from multiple clients, even with IF NOT
> EXISTS. First thing you need to do is fix your code so that this does not
> happen. Just create your tables manually from cqlsh allowing time for the
> schema to settle.
>
> #2 Here's the fix:
>
> 1) *Change your code to not automatically re-create tables (even with IF
> NOT EXISTS).*
>
> 2) Run a rolling restart to ensure schema matches across nodes. Run
> nodetool describecluster around your cluster. Check that there is only one
> schema version.
>
> ON EACH NODE:
> 3) Check your filesystem and see if you have two directories for the table
> in question in the data directory.
>
> If THERE ARE TWO OR MORE DIRECTORIES:
> 4)Identify from schema_column_families which cf ID is the "new" one
> (currently in use).
>
> cqlsh -e "select * from system.schema_column_families" | grep <table name>
>
>
> 5) Move the data from the "old" one to the "new" one and remove the old
> directory.
>
> 6) If there are multiple "old" ones repeat 5 for every "old" directory.
>
> 7) run nodetool refresh
>
> IF THERE IS ONLY ONE DIRECTORY:
>
> No further action is needed.
>
> All the best,
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
>
>
> On Fri, Jul 10, 2015 at 12:15 PM, Saladi Naidu 
> wrote:
>
> My understanding is that the Cassandra file structure follows the naming
> convention below:
>
> /cassandra/data/<keyspace_name>/<columnfamily_name>-<CFID>/
>
>
>
> Whereas our file structure is as below: each table has multiple
> directories, and when we drop and recreate tables, the old directories
> remain. Also, when we dropped the table, one node was down; when it came
> back, we tried to run nodetool repair, and the repair kept failing,
> referring to the CFID error listed below.
>
>
> drwxr-xr-x. 16 cass 

Re: DROP Table

2015-07-14 Thread Sebastian Estevez
Thank you Mikhail. I'll take a look at your utility.
On Jul 13, 2015 5:30 PM, "Mikhail Strebkov"  wrote:

> Hi Saladi,
>
> Recently I faced a similar problem; I had a lot of CFs to fix, so I wrote
> this: https://github.com/kluyg/cassandra-schema-fix
> I think it can be useful to you.
>
> Kind regards,
> Mikhail
>
> On Mon, Jul 13, 2015 at 11:51 AM, Saladi Naidu 
> wrote:
>
>> Sebastian,
>> Thank you so much for providing a detailed explanation. I still have some
>> questions, and I need to provide some clarifications:
>>
>> 1. We do not have code that creates tables dynamically. All DDL
>> operations are done through the DataStax DevCenter tool. When you say allow
>> the schema to settle, do you mean we should use a proper consistency level?
>> I don't think there is a provision for that in the tool. Or should I change
>> the SYSTEM KEYSPACE definition so the replication factor equals the number
>> of nodes?
>>
>> 2. In the steps described below for correcting this problem - when you
>> say move data from the old directory to the new one, do you mean move the
>> .db files? Won't that overwrite the current files?
>>
>> 3. Do we have to rename the directory to remove the CFID, i.e. just the
>> column family name without the CFID? After that, do we update the system
>> table as well?
>>
>>
>> Naidu Saladi
>>
>>   --
>>  *From:* Sebastian Estevez 
>> *To:* user@cassandra.apache.org; Saladi Naidu 
>> *Sent:* Friday, July 10, 2015 5:25 PM
>> *Subject:* Re: DROP Table
>>
>> #1 The cause of this problem is a CREATE TABLE statement collision. Do
>> *not* generate tables dynamically from multiple clients, even with IF
>> NOT EXISTS. The first thing you need to do is fix your code so that this
>> does not happen. Just create your tables manually from cqlsh, allowing
>> time for the schema to settle.
>>
>> #2 Here's the fix:
>>
>> 1) *Change your code to not automatically re-create tables (even with IF
>> NOT EXISTS).*
>>
>> 2) Run a rolling restart to ensure the schema matches across nodes. Run
>> nodetool describecluster on each node. Check that there is only one
>> schema version.
>>
>> ON EACH NODE:
>> 3) Check your filesystem and see if you have two directories for the
>> table in question in the data directory.
>>
>> IF THERE ARE TWO OR MORE DIRECTORIES:
>> 4) Identify from schema_columnfamilies which cf ID is the "new" one
>> (currently in use):
>>
>> cqlsh -e "select * from system.schema_columnfamilies" | grep <table_name>
>>
>>
>> 5) Move the data from the "old" one to the "new" one and remove the old
>> directory.
>>
>> 6) If there are multiple "old" ones, repeat step 5 for every "old" directory.
>>
>> 7) run nodetool refresh
>>
>> IF THERE IS ONLY ONE DIRECTORY:
>>
>> No further action is needed.
>>
>> All the best,
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>>
>>
>> On Fri, Jul 10, 2015 at 12:15 PM, Saladi Naidu 
>> wrote:
>>
>> My understanding is that the Cassandra file structure follows the naming
>> convention below:
>>
>> /cassandra/data/<keyspace_name>/<columnfamily_name>-<CFID>/
>>
>>
>>
>> Whereas our file structure is as below: each table has multiple
>> directories, and when we drop and recreate tables, the old directories
>> remain. Also, when we dropped the table, one node was down; when it came
>> back, we tried to run nodetool repair, and the repair kept failing,
>> referring to the CFID error listed below.
>>
>>
>> drwxr-xr-x. 16 cass cas

Re: Cassandra OOM on joining existing ring

2015-07-13 Thread Sebastian Estevez
Are you on Azure premium storage?
http://www.datastax.com/2015/04/getting-started-with-azure-premium-storage-and-datastax-enterprise-dse

Secondary indexes are built for convenience, not performance.
http://www.datastax.com/resources/data-modeling

What's your compaction strategy? Your nodes have to come up in order for
them to start compacting.
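
Once a node is up, a quick way to check the sstable count and the
compaction backlog is sketched below (the keyspace/table names are
placeholders borrowed from earlier in this thread):

# How many sstables does the hot table currently have?
nodetool cfstats app_10001.daily_challenges | grep "SSTable count"

# Are compactions running, and how much work is pending?
nodetool compactionstats
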
On Jul 13, 2015 1:11 AM, "Kunal Gangakhedkar" 
wrote:

> Hi,
>
> Looks like that is my primary problem - the sstable count for the
> daily_challenges column family is >5k. Azure had a scheduled maintenance
> window on Saturday. All the VMs got rebooted one by one - including the
> current Cassandra one - and it's taking forever to bring Cassandra back up
> online.
>
> Is there any way I can re-organize my existing data so that I can bring
> down that count?
> I don't want to lose that data.
> If possible, can I do that while Cassandra is down? As I mentioned, it's
> taking forever to get the service up - it's stuck reading those 5k sstable
> files (plus another 5k corresponding secondary index files). :(
> Oh, did I mention I'm new to Cassandra?
>
> Thanks,
> Kunal
>
> Kunal
>
> On 11 July 2015 at 03:29, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> #1
>>
>>> There is one table - daily_challenges - which shows compacted partition
>>> max bytes as ~460M and another one - daily_guest_logins - which shows
>>> compacted partition max bytes as ~36M.
>>
>>
>> 460 MB is high; I like to keep my partitions under 100 MB when possible.
>> I've seen worse, though. The fix is to add something else (maybe a month
>> or week bucket) into your partition key:
>>
>>  PRIMARY KEY ((segment_type, something_else), date, user_id, sess_id)
>>
>> #2 Looks like your jamm version is 3 per your env.sh, so you're probably
>> okay to copy the env.sh over from the C* 3.0 link I shared once you
>> uncomment and tweak the MAX_HEAP. If there's something wrong, your node
>> won't come up; tail your logs.
>>
>>
>>
>> All the best,
>>
>>
>> Sebastián Estévez
>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>
>> On Fri, Jul 10, 2015 at 2:44 PM, Kunal Gangakhedkar <
>> kgangakhed...@gmail.com> wrote:
>>
>>> And here is my cassandra-env.sh
>>> https://gist.github.com/kunalg/2c092cb2450c62be9a20
>>>
>>> Kunal
>>>
>>> On 11 July 2015 at 00:04, Kunal Gangakhedkar 
>>> wrote:
>>>
>>>> From the jhat output, the top 10 entries for "Instance Count for All
>>>> Classes (excluding platform)" show:
>>>>
>>>> 2088223 instances of class org.apache.cassandra.db.BufferCell
>>>> 1983245 instances of class
>>>> org.apache.cassandra.db.composites.CompoundSparseCellName
>>>> 1885974 instances of class
>>>> org.apache.cassandra.db.composites.CompoundDenseCellName
>>>> 63 instances of class
>>>> org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
>>>> 503687 instances of class org.apache.cassandra.db.BufferDeletedCell
>>>> 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
>>>> 101800 instances of class org.apache.cassandra.utils.concurrent.Ref
>>>> 101800 instances of class
>>>> org.apache.cassandra.utils.concurrent.Ref$State
>>>> 90704 instances of class
>>>> org.apache.cassandra.utils.concurrent.Ref$GlobalState
>>>> 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey
>>>>
>>>> At the bottom of the page, it shows:
>>>> Total of 8739510 instances occupying 193607512 bytes.

Re: DROP Table

2015-07-10 Thread Sebastian Estevez
#1 The cause of this problem is a CREATE TABLE statement collision. Do
*not* generate tables dynamically from multiple clients, even with IF NOT
EXISTS. The first thing you need to do is fix your code so that this does
not happen. Just create your tables manually from cqlsh, allowing time for
the schema to settle.

#2 Here's the fix (a consolidated shell sketch follows the steps):

1) *Change your code to not automatically re-create tables (even with IF
NOT EXISTS).*

2) Run a rolling restart to ensure the schema matches across nodes. Run
nodetool describecluster on each node. Check that there is only one
schema version.

ON EACH NODE:
3) Check your filesystem and see if you have two directories for the table
in question in the data directory.

IF THERE ARE TWO OR MORE DIRECTORIES:
4) Identify from schema_columnfamilies which cf ID is the "new" one
(currently in use):

cqlsh -e "select * from system.schema_columnfamilies" | grep <table_name>


5) Move the data from the "old" one to the "new" one and remove the old
directory.

6) If there are multiple "old" ones, repeat step 5 for every "old" directory.

7) run nodetool refresh

IF THERE IS ONLY ONE DIRECTORY:

No further action is needed.
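
Putting steps 4-7 together as a shell sketch (the keyspace "myks", the
table "mytable", and the <old_cfid>/<new_cfid> placeholders are all
hypothetical; use the cf ID returned by the query to pick the active
directory):

# 4) Find which cf ID is currently in use for the table:
cqlsh -e "select * from system.schema_columnfamilies" | grep mytable

# 5-6) Copy each stale directory's sstables into the active one, then
# remove the stale directory (repeat for every old directory):
cd /var/lib/cassandra/data/myks
cp mytable-<old_cfid>/* mytable-<new_cfid>/
rm -r mytable-<old_cfid>

# 7) Load the moved sstables without a restart:
nodetool refresh myks mytable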

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 12:15 PM, Saladi Naidu 
wrote:

> My understanding is that the Cassandra file structure follows the naming
> convention below:
>
> /cassandra/data/<keyspace_name>/<columnfamily_name>-<CFID>/
>
>
>
> Whereas our file structure is as below: each table has multiple
> directories, and when we drop and recreate tables, the old directories
> remain. Also, when we dropped the table, one node was down; when it came
> back, we tried to run nodetool repair, and the repair kept failing,
> referring to the CFID error listed below.
>
>
> drwxr-xr-x. 16 cass cass 4096 May 24 06:49 ../
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> application_by_user-e0eec95019a211e58b954ffc8e9bfaa6/
> drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 application_info-
> 4dba2bf0054f11e58b954ffc8e9bfaa6/
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> application_info-a0ee65d019a311e58b954ffc8e9bfaa6/
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> configproperties-228ea2e0c13811e4aa1d4ffc8e9bfaa6/
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> user_activation-95d005f019a311e58b954ffc8e9bfaa6/
> drwxr-xr-x.  3 cass cass 4096 Jun 25 10:16
> user_app_permission-9fddcd62ffbe11e4a25a45259f96ec68/
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> user_credential-86cfff1019a311e58b954ffc8e9bfaa6/
> drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09
> user_info-2fa076221b1011e58b954ffc8e9bfaa6/
> drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15
> user_info-36028c00054f11e58b954ffc8e9bfaa6/
> drwxr-xr-x.  3 cass cass 4096 Jun 25 10:15
> user_info-fe1d7b101a5711e58b954ffc8e9bfaa6/
> drwxr-xr-x.  4 cass cass 4096 Jun 25 10:16
> user_role-9ed0ca30ffbe11e4b71d09335ad2d5a9/
>
>
> WARN  [Thread-2579] 2015-07-02 16:02:27,523 IncomingTcpConnection.java:91
> - UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=218e3c90-1b0e-11e5-a34b-d7c17b3e318a
> at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:272)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:168)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.net.IncomingT

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
#1

> There is one table - daily_challenges - which shows compacted partition
> max bytes as ~460M and another one - daily_guest_logins - which shows
> compacted partition max bytes as ~36M.


460 MB is high; I like to keep my partitions under 100 MB when possible.
I've seen worse, though. The fix is to add something else (maybe a month
or week bucket) into your partition key:

 PRIMARY KEY ((segment_type, something_else), date, user_id, sess_id)
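
As a concrete sketch, a month-bucketed variant of the earlier schema could
be created as below; the table name, the "month" column, and its format are
assumptions for illustration, not the required design:

cqlsh -e "
CREATE TABLE app_10001.daily_challenges_by_month (
    segment_type text,
    month text,    -- e.g. '2015-07'; bounds each partition to one month
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY ((segment_type, month), date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC);"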

#2 Looks like your jamm version is 3 per your env.sh, so you're probably
okay to copy the env.sh over from the C* 3.0 link I shared once you
uncomment and tweak the MAX_HEAP. If there's something wrong, your node
won't come up; tail your logs.



All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 2:44 PM, Kunal Gangakhedkar  wrote:

> And here is my cassandra-env.sh
> https://gist.github.com/kunalg/2c092cb2450c62be9a20
>
> Kunal
>
> On 11 July 2015 at 00:04, Kunal Gangakhedkar 
> wrote:
>
>> From the jhat output, the top 10 entries for "Instance Count for All
>> Classes (excluding platform)" show:
>>
>> 2088223 instances of class org.apache.cassandra.db.BufferCell
>> 1983245 instances of class
>> org.apache.cassandra.db.composites.CompoundSparseCellName
>> 1885974 instances of class
>> org.apache.cassandra.db.composites.CompoundDenseCellName
>> 63 instances of class
>> org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
>> 503687 instances of class org.apache.cassandra.db.BufferDeletedCell
>> 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
>> 101800 instances of class org.apache.cassandra.utils.concurrent.Ref
>> 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
>>
>> 90704 instances of class
>> org.apache.cassandra.utils.concurrent.Ref$GlobalState
>> 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey
>>
>> At the bottom of the page, it shows:
>> Total of 8739510 instances occupying 193607512 bytes.
>> JFYI.
>>
>> Kunal
>>
>> On 10 July 2015 at 23:49, Kunal Gangakhedkar 
>> wrote:
>>
>>> Thanks for the quick reply.
>>>
>>> 1. I don't know what thresholds I should look for. So, to save this
>>> back-and-forth, I'm attaching the cfstats output for the keyspace.
>>>
>>> There is one table - daily_challenges - which shows compacted partition
>>> max bytes as ~460M and another one - daily_guest_logins - which shows
>>> compacted partition max bytes as ~36M.
>>>
>>> Can that be a problem?
>>> Here is the CQL schema for the daily_challenges column family:
>>>
>>> CREATE TABLE app_10001.daily_challenges (
>>> segment_type text,
>>> date timestamp,
>>> user_id int,
>>> sess_id text,
>>> data text,
>>> deleted boolean,
>>> PRIMARY KEY (segment_type, date, user_id, sess_id)
>>> ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
>>> AND bloom_filter_fp_chance = 0.01
>>> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>>> AND comment = ''
>>> AND compaction = {'min_threshold': '4', 'class':
>>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>> 'max_threshold': '32'}
>>> AND compression = {'sstable_compression':
>>> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>> AND dclocal_read_repair_chance = 0.1
>>> AND default_time_to_live = 0
>>> AND gc_grace_seconds = 864000
>>> AND max_index_interval = 2048
>>> AND memtable_flush

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
1. You want to look at the # of sstables in cfhistograms, or in cfstats
look at the following (example commands below):
Compacted partition maximum bytes
Maximum live cells per slice
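
For example (the keyspace/table names are placeholders for whichever table
you suspect):

# Per-read sstable counts and partition size percentiles:
nodetool cfhistograms app_10001 daily_challenges

# The two headline numbers from cfstats:
nodetool cfstats app_10001.daily_challenges | \
    egrep "Compacted partition maximum bytes|Maximum live cells per slice"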

2) No, here's the env.sh from 3.0, which should work with some tweaks:
https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh

You'll at least have to modify the jamm version to match what's in yours; I
think it's 2.5. A sketch of the line to adjust follows.
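
In cassandra-env.sh, the agent line looks something like this (the exact
jar version is an assumption; match whatever is actually in your lib/
directory):

# Point the memory-metering agent at the jamm jar your install ships:
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"

# And set both heap bounds to the same explicit value, e.g.:
MAX_HEAP_SIZE="8G"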



All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar  wrote:

> Thanks, Sebastian.
>
> A couple of questions (I'm really new to Cassandra):
> 1. How do I interpret the output of 'nodetool cfstats' to figure out the
> issues? Any documentation pointer on that would be helpful.
>
> 2. I'm primarily a Python/C developer - so, totally clueless about the JVM
> environment. Please bear with me, as I will need a lot of hand-holding.
> Should I just copy+paste the settings you gave and try to restart the
> failing Cassandra server?
>
> Thanks,
> Kunal
>
> On 10 July 2015 at 22:35, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> #1 You need more information.
>>
>> a) Take a look at your .hprof file (the heap dump from the OOM) with an
>> introspection tool like jhat, VisualVM, or Java Flight Recorder and see
>> what is using up your RAM.
>>
>> b) How big are your large rows? (Use nodetool cfstats on each node.) If
>> your data model is bad, you are going to have to re-design it no matter
>> what.
>>
>> #2 As a possible workaround, try using the G1GC allocator with the
>> settings from C* 3.0 instead of CMS. I've seen lots of success with it
>> lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely
>> tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not*
>> set the newgen size; G1 sets it dynamically:
>>
>> # min and max heap sizes should be set to the same value to avoid
>>> # stop-the-world GC pauses during resize, and so that we can lock the
>>> # heap in memory on startup to prevent any of it from being swapped
>>> # out.
>>> JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
>>> JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
>>>
>>> # Per-thread stack size.
>>> JVM_OPTS="$JVM_OPTS -Xss256k"
>>>
>>> # Use the Hotspot garbage-first collector.
>>> JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
>>>
>>> # Have the JVM do less remembered set work during STW, instead
>>> # preferring concurrent GC. Reduces p99.9 latency.
>>> JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
>>>
>>> # The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
>>> # Machines with > 10 cores may need additional threads.
>>> # Increase to <= full cores (do not count HT cores).
>>> #JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
>>> #JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"
>>>
>>> # Main G1GC tunable: lowering the pause target will lower throughput and
>>> # vice versa.
>>> # 200ms is the JVM default and lowest viable setting
>>> # 1000ms increases throughput. Keep it smaller than the timeouts in
>>> cassandra.yaml.
>>> JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
>>> # Do reference processing in parallel GC.
>>> JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
>>>
>>> # This may help eliminate STW.
>>> # The default in Hotspot 8u40 is 40%.
>>> #JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"
>>>
>>> # For workloads that do large allocations, increasing the region
>>> # size may make things more efficient. Otherwise, let the JVM
>>&
