Re: Cassandra 2.x Stability

2016-12-02 Thread Benjamin Roth
No worries.
I added some patches to these tickets after having tested them yesterday
with our production cluster.
I think this will be a huge step for MV stability.

Anybody is welcome to comment on them or give me a review:
CASSANDRA-12888, CASSANDRA-12905, CASSANDRA-12984

Thanks folks!

2016-12-01 19:14 GMT+01:00 Kai Wang :

> Ben, I just read through those two tickets. It's scarier than I thought.
> Thank you for all the investigations and comments.
>
> On Thu, Dec 1, 2016 at 10:31 AM, Benjamin Roth 
> wrote:
>
>> A little experience report on MVs:
>>
>> We use them in production (3.10-trunk) and they work really well on
>> normal read/write operations but streaming operations (bootstrap, repair,
>> rebuild, decommission) can kill your cluster and/or your nerves.
>> We will stay with MVs as we need them and want them.
>> I rolled out a patch on MV streaming on our production cluster a few
>> hours ago as we had problems with bootstrapping new nodes.
>>
>> Before:
>> - Error log was completely flooded with WTEs (WriteTimeoutExceptions)
>> - Bootstrap either failed due to exceptions or wasn't even close to
>> finishing after 24h - it just did not work
>>
>> After:
>> - Bootstrap finished without a single error in the log in less than 5:30h
>>
>> I started to roll out that patch to the whole cluster to see how repairs
>> are affected. Will keep you updated.
>>
>> There is no dedicated JIRA issue assigned as it addresses multiple
>> tickets like CASSANDRA-12905 + CASSANDRA-12888
>>
>>
>> 2016-12-01 16:21 GMT+01:00 Jonathan Haddad :
>>
>>> I agree with everything you just said, Kai.  I'd start a new project
>>> with 3.0.10.  I'd stay away from MVs though.
>>>
>>> On Thu, Dec 1, 2016 at 10:19 AM Kai Wang  wrote:
>>>
>>>> Just based on a few observations on this list. Not one week goes by
>>>> without people asking which release is the most stable on the 3.x line.
>>>> Folks at Instaclustr also provide their own 3.x fork for stability
>>>> issues, etc.
>>>>
>>>> We developers already have enough to think about. I really don't feel
>>>> like spending time researching which release of C* I should choose. So for
>>>> me, 2.2.x is the choice in production.
>>>>
>>>> That being said, I have nothing against 3.x. I do like its new storage
>>>> engine. If I start a brand new project today with zero previous C*
>>>> experience, I probably would choose 3.0.10 as my starting point. However if
>>>> I were to upgrade to 3.x, I would have to test it thoroughly in a dev
>>>> environment with real production load and monitor it very closely on
>>>> performance, compaction, repair, bootstrap, replacing etc. Data is simply
>>>> too important to take chances with.


>>>> On Thu, Dec 1, 2016 at 9:38 AM, Shalom Sagges
>>>> wrote:
>>>>
>>>>> Hey Kai,
>>>>>
>>>>> Thanks for the info. Can you please elaborate on the reasons you'd pick
>>>>> 2.2.6 over 3.0?
>>>>>
>>>>>
>>>>> Shalom Sagges
>>>>> DBA
>>>>> T: +972-74-700-4035


>>>>> On Thu, Dec 1, 2016 at 2:26 PM, Kai Wang wrote:
>>>>>
>>>>>> I have been running 2.2.6 in production. As of today I would still pick
>>>>>> it over 3.x for production.
>>>>>>
>>>>>> On Nov 30, 2016 5:42 AM, "Shalom Sagges"
>>>>>> wrote:

>>>>>>> Hi Everyone,
>>>>>>>
>>>>>>> I'm about to upgrade our 2.0.14 version to a newer 2.x version.
>>>>>>> At first I thought of upgrading to 2.2.8, but I'm not sure how stable
>>>>>>> it is, as I understand the 2.2 version was supposed to be a sort of beta
>>>>>>> version for 3.0 feature-wise, whereas 3.0 upgrade will mainly handle the
>>>>>>> storage modifications (please correct me if I'm wrong).
>>>>>>>
>>>>>>> So my question is, if I need a 2.x version (can't upgrade to 3 due to
>>>>>>> client considerations), which one should I choose, 2.1.x or 2.2.x? (I
>>>>>>> don't require any new features available in 2.2).
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Shalom Sagges
>>>>>>> DBA
>>>>>>> T: +972-74-700-4035


 This message may contain confidential and/or privileged information.
 If you are not the addressee or authorized to receive this on behalf of
 the addressee you must not use, copy, disclose or take action based on this
 message or any information herein.
 If you have received this message in error, please advise the sender
 immediately by reply email and delete this message. Thank you.




Re: Which version is stable enough for production environment?

2016-12-02 Thread Hugo José Pinto
All,

Many thanks for this enlightening thread.

We're about to go live with a client for a pre-production environment, and
must decide on which 3.x version to use. We will probably need to perform
regular repairs, so we are obviously worried about both CASSANDRA-12905 and
CASSANDRA-12888 that Benjamin referred to.

Hence, the two golden questions:

1) Are these issues already present in 3.0.x?

2) What would be the best 3.x version to put in production at this moment?

Many thanks for any help you can come up with,

--
Hugo José Pinto


>
> LeveledCompaction: Have you checked if there were major changes in the
> LeveledStrategy between 2.x and 3.x?
>
> 2016-11-30 21:04 GMT+01:00 Harikrishnan Pillai :
>
>> https://issues.apache.org/jira/browse/CASSANDRA-12728
>> (Handling partially written hint files)
>> https://issues.apache.org/jira/browse/CASSANDRA-12844
>>
>>
>> Also, when I tested some of our write-heavy workloads, Leveled Compaction
>> was not keeping up. With the same system settings, 2.1.16 performs better
>> and all levels were properly aligned.
>> --
>> *From:* Benjamin Roth 
>> *Sent:* Tuesday, November 29, 2016 11:20:19 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Which version is stable enough for production environment?
>>
>> What are the compaction issues / hint corruptions you encountered? Are
>> there JIRA tickets for them?
>> I am curious because I use 3.10 (trunk) in production.
>>
>> For anyone who is planning to use MVs:
>> They basically work. We have used them in production for some months, BUT
>> (and it's quite a big one) maintenance is a pain. Bootstrapping and repairs
>> may be - depending on the model, config, amount of data - really, really
>> painful. I'm currently investigating intensively.
>>
>> 2016-11-30 3:11 GMT+01:00 Harikrishnan Pillai :
>>
>>> 3.0 has the "off the heap memtable" implementation removed, so if you
>>> have a requirement for it, it's not available. If you don't have that
>>> requirement, 3.0.9 can be tried out. We did some testing with 3.9 and
>>> found a lot of issues in compaction, hint corruption, etc.
>>>
>>> Regards
>>>
>>> Hari
>>>
>>>
>>> --
>>> *From:* Discovery 
>>> *Sent:* Tuesday, November 29, 2016 5:59 PM
>>> *To:* user
>>> *Subject:* Re: Which version is stable enough for production
>>> environment?
>>>
>>> Why is version 3.x not recommended?  Thanks.
>>>
>>>
>>> -- Original --
>>> *From:* "Harikrishnan Pillai"
>>> *Date:* Wed, Nov 30, 2016 09:57 AM
>>> *To:* "user"
>>> *Subject:* Re: Which version is stable enough for production
>>> environment?
>>>
>>> Cassandra 2.1.16
>>>
>>>
>>> --
>>> *From:* Discovery 
>>> *Sent:* Tuesday, November 29, 2016 5:42 PM
>>> *To:* user
>>> *Subject:* Which version is stable enough for production environment?
>>>
>>> Hi Cassandra Experts,
>>>
>>>   We are preparing to deploy Cassandra in a production environment, but
>>> we cannot determine which version is stable and recommended. Could someone
>>> on this mailing list give a suggestion? Thanks in advance!
>>>
>>>
>>> Best Regards
>>> Discovery
>>> 11/30/2016
>>>
>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>


Re: Which version is stable enough for production environment?

2016-12-02 Thread Benjamin Roth
These issues have existed since the very first implementation of MVs, in
*ALL* Cassandra versions.
If you want to use MVs, you may want to wait until these issues are
officially resolved. For testing or pre-prod, you could check out
https://github.com/Jaumo/cassandra/commits/CASSANDRA-12905. I have fixed
these issues unofficially and have this version currently running on our
own cluster - but no warranties.

If you don't need MVs, you can for example use the Instaclustr 3.7 LTS
version as announced earlier in this thread.

2016-12-02 14:11 GMT+01:00 Hugo José Pinto :

> All,
>
> Many thanks for this enlightening thread.
>
> We're about to go live with a client for a pre-production environment, and
> must decide on which 3.x version to use. We will probably need to perform
> regular repairs, so we are obviously worried about both CASSANDRA-12905
> and CASSANDRA-12888 that Benjamin referred to.
>
> Hence, the two golden questions:
>
> 1) Are these issues already present in 3.0.x?
>
> 2) What would be the best 3.x version to put in production at this moment?
>
> Many thanks for any help you can come up with,
>
> --
> Hugo José Pinto
>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Why does `now()` produce different times within the same query?

2016-12-02 Thread Edward Capriolo
On Thu, Dec 1, 2016 at 11:09 AM, Sylvain Lebresne 
wrote:

> On Thu, Dec 1, 2016 at 4:44 PM, Edward Capriolo 
> wrote:
>
>>
>> I am not sure you saw my reply on the thread, but I believe everyone's
>> needs can be met. I will copy that here:
>>
>
> I saw it, but the real problem that was raised initially was not that of
> UDFs and of allowing both behaviors. It's a matter of people being confused
> by the behavior of a non-UDF function, now(), and suggesting it should be
> changed.
>
> The Hive idea is interesting I guess, and we can switch to discussing
> that, but it's a different problem really and I'm not fond of derailing
> threads. I will just note though that if we're not talking about a
> confusion issue but rather how to get a timeuuid to be fixed within a
> statement, then there is a much, much more trivial solution: generate it
> client side. The `now()` function is a small convenience but there is
> nothing you cannot do without it client side, and that actually basically
> stands for almost any use of (non aggregate) function in Cassandra
> currently.
>
>
>>
>>
>> "Food for thought: Hive's UDFs introduced an annotation
>> @UDFType(deterministic = false)
>>
>> http://dmtolpeko.com/2014/10/15/invoking-stateful-udf-at-map-and-reduce-side-in-hive/
>>
>> The effect is the query planner can see when such a UDF is in use and
>> determine the value once at the start of a very long query."
>>
>> Essentially Hive had a similar if not identical problem: during a
>> long-running distributed process like map/reduce, some users wanted the
>> semantics of:
>>
>> 1) Each call should have a new timestamp
>>
>> While other users wanted the semantics of:
>>
>> 2) Each call should generate the same timestamp
>>
>> The solution implemented was to add an annotation to the UDF such that the
>> query planner would pick up the annotation and act accordingly.
>>
>> (Here is a related issue: https://issues.apache.org/jira/browse/HIVE-1986)
>>
>> As a result you can essentially implement two UDFs
>>
>> @UDFType(deterministic = false)
>> public class UDFNow
>>
>> and for the other people
>>
>> @UDFType(deterministic = true)
>> public class UDFNowOnce extends UDFNow
>>
>> Both use cases are met in a sensible way.
>>
>
>
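
A minimal sketch of the annotation idea quoted above, in plain Java. Note
that `Deterministic`, `NowFunction`, `NowPerCall`, `NowOnce` and
`StatementPlanner` are hypothetical names made up for illustration; this is
neither Hive's actual `@UDFType` machinery nor anything Cassandra provides.
It only shows how a planner could read such a flag and evaluate a function
once per statement instead of once per call:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.function.LongSupplier;

// Hypothetical stand-in for an annotation like Hive's @UDFType.
@Retention(RetentionPolicy.RUNTIME)
@interface Deterministic {
    boolean value();
}

// Hypothetical function interface: one call per row.
interface NowFunction {
    long evaluate();
}

// Semantics 1: every call reads the clock again.
@Deterministic(false)
class NowPerCall implements NowFunction {
    private final LongSupplier clock;

    NowPerCall(LongSupplier clock) {
        this.clock = clock;
    }

    @Override
    public long evaluate() {
        return clock.getAsLong();
    }
}

// Semantics 2: same implementation, but flagged so the planner may fold it.
@Deterministic(true)
class NowOnce extends NowPerCall {
    NowOnce(LongSupplier clock) {
        super(clock);
    }
}

class StatementPlanner {
    // Returns the function the executor should call for each row.
    static NowFunction plan(NowFunction fn) {
        Deterministic flag = fn.getClass().getAnnotation(Deterministic.class);
        if (flag != null && flag.value()) {
            long fixed = fn.evaluate(); // evaluated once, at planning time
            return () -> fixed;         // every later call sees the same value
        }
        return fn; // unflagged or non-deterministic: evaluate on every call
    }
}
```

With a counter as the clock, the function planned from `NowPerCall` returns
a new value on each call, while the one planned from `NowOnce` keeps
returning the value captured at planning time.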
> The `now()` function is a small convenience but there is nothing you cannot
> do without it client side, and that actually basically stands for almost
> any use of (non aggregate) function in Cassandra currently.

Cassandra's changing philosophy over which entity (client, server, or
driver) should create such information does not make this problem easy.

If you take into account that you have users who do not understand all the
intricacies of UUIDs, the problem is compounded. I.e., how does one generate
a UUID in each of C#, Python, Java, etc., with the 47 random bits of bla bla?
That is not super easy information to find. Maybe you find a Stack Overflow
post that actually gives bad advice, etc.

Many times in Cassandra you are using a UUID because you do not have a
unique key in the insert and you wish to create one. If you are inserting
more than a single record using that same UUID and you do not want the
burden of generating it yourself, you would have to do write>>read>>write,
which is an anti-pattern.
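
For the client-side route, here is a rough sketch of building a version 1
(time-based) UUID with nothing but the JDK, so that one value can be reused
for several records. `ClientTimeUuid` and `fromUnixMillis` are hypothetical
names; the random clock sequence and node bits are a simplification (there
is no per-process counter for values created within the same millisecond),
so treat it as an illustration rather than a replacement for the generator
shipped with your driver:

```java
import java.security.SecureRandom;
import java.util.UUID;

class ClientTimeUuid {
    // 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a version 1 UUID whose timestamp bits encode the given Unix time.
    static UUID fromUnixMillis(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET; // 60-bit timestamp

        long msb = ((ts & 0xFFFFFFFFL) << 32)          // time_low
                 | (((ts >>> 32) & 0xFFFFL) << 16)     // time_mid
                 | 0x1000L                             // version = 1
                 | ((ts >>> 48) & 0x0FFFL);            // time_hi

        long lsb = RANDOM.nextLong();
        lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L; // variant bits "10"
        lsb |= 0x0000010000000000L; // multicast bit: node id is random, not a MAC

        return new UUID(msb, lsb);
    }
}
```

Generate the value once, e.g.
`UUID id = ClientTimeUuid.fromUnixMillis(System.currentTimeMillis());`, and
bind the same `id` in every INSERT that should share it; no read is needed
in between.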


Re: Why does `now()` produce different times within the same query?

2016-12-02 Thread Jonathan Haddad
This isn't about using the same UUID though. It's about the timestamp bits
in the UUID.

What is the use case for generating multiple UUIDs in a single row? Why do
you need to extract the timestamp out of both?
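
To make "the timestamp bits" concrete, a small sketch, assuming the
standard version 1 UUID layout: `java.util.UUID.timestamp()` returns the
embedded 60-bit count of 100 ns intervals since 1582-10-15, which can be
shifted to Unix milliseconds. `TimeUuidTimestamps` and `unixMillis` are
hypothetical names:

```java
import java.util.UUID;

class TimeUuidTimestamps {
    // 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Decodes the timestamp bits of a time-based UUID into Unix milliseconds.
    static long unixMillis(UUID timeUuid) {
        if (timeUuid.version() != 1) {
            throw new IllegalArgumentException("not a time-based (version 1) UUID");
        }
        return (timeUuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }
}
```

For example, `13814000-1dd2-11b2-8080-808080808080` decodes to 0 (the Unix
epoch) and `13816710-1dd2-11b2-8080-808080808080` to 1. Two timeuuids
produced by separate calls are different values and can also decode to
different milliseconds.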
On Fri, Dec 2, 2016 at 10:24 AM Edward Capriolo 
wrote:

>
> On Thu, Dec 1, 2016 at 11:09 AM, Sylvain Lebresne 
> wrote:
>
> On Thu, Dec 1, 2016 at 4:44 PM, Edward Capriolo 
> wrote:
>
>
> I am not sure you saw my reply on the thread, but I believe everyone's
> needs can be met. I will copy that here:
>
>
> I saw it, but the real problem that was raised initially was not that of
> UDFs and of allowing both behaviors. It's a matter of people being confused
> by the behavior of a non-UDF function, now(), and suggesting it should be
> changed.
>
> The Hive idea is interesting I guess, and we can switch to discussing
> that, but it's a different problem really and I'm not fond of derailing
> threads. I will just note though that if we're not talking about a
> confusion issue but rather how to get a timeuuid to be fixed within a
> statement, then there is a much, much more trivial solution: generate it
> client side. The `now()` function is a small convenience but there is
> nothing you cannot do without it client side, and that actually basically
> stands for almost any use of (non aggregate) function in Cassandra
> currently.
>
>
>
>
> "Food for thought: Hive's UDFs introduced an annotation  
> @UDFType(deterministic = false)
>
>
> http://dmtolpeko.com/2014/10/15/invoking-stateful-udf-at-map-and-reduce-side-in-hive/
>
> The effect is the query planner can see when such a UDF is in use and
> determine the value once at the start of a very long query."
>
> Essentially Hive had a similar if not identical problem: during a
> long-running distributed process like map/reduce, some users wanted the
> semantics of:
>
> 1) Each call should have a new timestamp
>
> While other users wanted the semantics of:
>
> 2) Each call should generate the same timestamp
>
> The solution implemented was to add an annotation to the UDF such that the
> query planner would pick up the annotation and act accordingly.
>
> (Here is a related issue: https://issues.apache.org/jira/browse/HIVE-1986)
>
> As a result you can essentially implement two UDFs
>
> @UDFType(deterministic = false)
> public class UDFNow
>
> and for the other people
>
> @UDFType(deterministic = true)
> public class UDFNowOnce extends UDFNow
>
> Both use cases are met in a sensible way.
>
>
>
> The `now()` function is a small convenience but there is nothing you
> cannot do without it client side, and that actually basically stands for
> almost any use of (non aggregate) function in Cassandra currently.
>
> Cassandra's changing philosophy over which entity (client, server, or
> driver) should create such information does not make this problem easy.
>
> If you take into account that you have users who do not understand all the
> intricacies of UUIDs, the problem is compounded. I.e., how does one generate
> a UUID in each of C#, Python, Java, etc., with the 47 random bits of bla bla?
> That is not super easy information to find. Maybe you find a Stack Overflow
> post that actually gives bad advice, etc.
>
> Many times in Cassandra you are using a UUID because you do not have a
> unique key in the insert and you wish to create one. If you are inserting
> more than a single record using that same UUID and you do not want the
> burden of generating it yourself, you would have to do write>>read>>write,
> which is an anti-pattern.
>


Re: Single cluster node restore

2016-12-02 Thread Anuj Wadehra
Hi Petr,
If data corruption means accidental data deletions via Cassandra commands, you
have to restore the entire cluster from the latest snapshots. This may lead to
data loss, as there may be valid updates made after the snapshot was taken but
before the data deletion. Restoring a single node from a snapshot won't help,
as Cassandra replicated the accidental deletes to all nodes.
If data corruption means accidental deletion of some sstable files from the
file system of a node, repair would fix it.
If data corruption means unreadable data due to hardware issues etc., you will
have two options after replacing the disk: bootstrap, or restore a snapshot on
the single affected node. If you have a lot of data per node, e.g. 300 GB, you
may want to restore from a snapshot followed by repair. Restoring a snapshot on
a single node is faster than streaming all data via bootstrap. If the node is
not recoverable and must be replaced, you should be able to do auto-bootstrap,
or restore from a snapshot with auto-bootstrap set to false. I haven't replaced
a dead node from a snapshot, but there should not be any issues, as token
ranges don't change when you replace a node.
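
Why a single-node restore cannot undo a replicated delete (the first case
above) can be sketched with a toy model of timestamp-based reconciliation.
This is a simplification with hypothetical names (`Cell`,
`ReplicaMerge.reconcile`), not Cassandra's actual code: each replica holds
either a value or a delete marker (tombstone) with a write timestamp, and
the newest write wins when replicas are compared:

```java
import java.util.Objects;

// One version of a cell as held by one replica.
final class Cell {
    final String value;        // null models a tombstone (delete marker)
    final long writeTimestamp; // e.g. microseconds since the Unix epoch

    private Cell(String value, long writeTimestamp) {
        this.value = value;
        this.writeTimestamp = writeTimestamp;
    }

    static Cell live(String value, long writeTimestamp) {
        return new Cell(Objects.requireNonNull(value), writeTimestamp);
    }

    static Cell tombstone(long writeTimestamp) {
        return new Cell(null, writeTimestamp);
    }

    boolean isTombstone() {
        return value == null;
    }
}

final class ReplicaMerge {
    // Newest write wins; on a timestamp tie this sketch prefers the tombstone.
    static Cell reconcile(Cell a, Cell b) {
        if (a.writeTimestamp != b.writeTimestamp) {
            return a.writeTimestamp > b.writeTimestamp ? a : b;
        }
        return a.isTombstone() ? a : b;
    }
}
```

A cell restored from an older snapshot, `Cell.live("v1", 100)`, reconciled
against the tombstone held by the other replicas, `Cell.tombstone(200)`,
still yields the tombstone, so the accidental delete survives the restore.
In this model the restored value only comes back if no replica holds a
newer tombstone any more, which is the reintroduction risk raised in the
original question.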



Thanks
Anuj 
 
  On Tue, 29 Nov, 2016 at 11:08 PM, Petr Malik wrote:   


Hi.

I have a question about Cassandra backup-restore strategies.


As far as I understand Cassandra has been designed to survive hardware failures 
by relying on data replication.




It seems like people still want backup/restore for cases when somebody
accidentally deletes data or the data gets otherwise corrupted.

In that case restoring all keyspace/table snapshots on all nodes should bring 
it back.




I am asking because I often read directions on restoring a single node in a 
cluster. I am just wondering under what circumstances could this be done safely.





Please correct me if I am wrong, but restoring just a single node does not
really roll back the data, as the newer (corrupt) data will be served by other
replicas and eventually propagated to the restored node. Right?

In fact, by doing so one may end up reintroducing deleted data...




Also, since Cassandra distributes the data throughout the cluster, it is not
clear on which node any particular (corrupt) data resides and hence which one
to restore.




I guess this is a long way of asking whether there is an advantage of trying to 
restore just a single node in a Cassandra cluster as opposed to say replacing 
the dead node and letting Cassandra handle the replication.




Thanks.