batchstatement

2018-07-13 Thread Randy Lynn
TL;DR:
- exactly 1 of the 14 statements in a batch does not mutate the partition.
- no error is logged in the application layer, Cassandra system.log, or
Cassandra debug.log
- using C# and the latest DataStax driver
- cluster is a 1-node dev setup.
- DataStax driver configured with LOCAL_QUORUM at the session and
statement level.
- using prepared statements.. 1,000% sure there's no typo.. (but I've been
wrong before)


I have about 14 statements that get batched up together. They're updating
at most 2, maybe 3 denormalized tables.. all the same user object, just
different lookup keys.

To help visualize, the tables look a little like these.. abbreviated..
User table..
CREATE TABLE user (u_id uuid, act_id uuid, ext_id text, dt_created
timeuuid, dt_mod timeuuid, is_group Boolean, first_name text)

Users By Account (or plan)
CREATE TABLE user_by_act (act_id uuid, u_id uuid, first_name text)

User By external identifier
CREATE TABLE user_by_ext (ext_id text, u_id uuid, act_id uuid, first_name
text)

I create a batch that updates all the tables.. various updates are broken
out into separate statements, so for example, there's a statement that
updates the external ID in the 'user' table.

UPDATE user SET ext_id = :ext_id WHERE u_id = :u_id

This particular batch has 14 statements total, across all 3 tables. They
are only updating at most 3 partitions.. a single partition may have 4 or
more statements to update various parts of the partition. e.g. first name
and last name are a single statement added to the batch.

Here's the problem... of those 14 statements.. across the 3 partitions...
ONE and ONLY ONE update doesn't work.. Absolutely every other discrete
update in the whole batch works.

List<BoundStatement> boundStatements = new List<BoundStatement>();

// *
// user table
boundStatements.Add(SessionManager.UserInsertStatement.Bind(new
    { u_id = user.UserId, act_id = user.ActId, dt_created = nowId,
      dt_mod = nowId, is_everyone = user.IsEveryone,
      is_group = user.IsGroup }));

// This statement gets added to the list.. it is part of the batch,
// but it NEVER updates the actual field in the database.
// I have moved it around.. up, down... the only thing that works
// is if I call execute on the first binding above, and then add the rest
// of these as a separate batch.
if (!string.IsNullOrWhiteSpace(user.ExtId))
    boundStatements.Add(SessionManager.UserUpdateExtIdStatement.Bind(new
        { u_id = user.UserId, ext_id = user.ExtId, dt_mod = nowId }));

if (!string.IsNullOrWhiteSpace(user.Email))
    boundStatements.Add(SessionManager.UserUpdateEmailStatement.Bind(new
        { u_id = user.UserId, email = user.Email, dt_mod = nowId }));

BoundStatement userProfile = CreateUserProfileBoundStatement(nowId, user);
if (userProfile != null)
    boundStatements.Add(userProfile);

// *
// user_by_act table
CreateUserAccountInsertBoundStatements(boundStatements, user, nowId);

// *
// user_by_ext table
if (!string.IsNullOrWhiteSpace(user.ExtId))
{
    boundStatements.Add(SessionManager.UserExtInsertStatement.Bind(new
        { ext_id = user.ExtId, act_id = user.ActId, dt_created = nowId,
          dt_mod = nowId, is_group = user.IsGroup, u_id = user.UserId }));

    BoundStatement userByExtProfile =
        CreateUserByExtProfileBoundStatement(nowId, user);
    if (userByExtProfile != null)
        boundStatements.Add(userByExtProfile);

    if (!string.IsNullOrWhiteSpace(user.Email))
        boundStatements.Add(SessionManager.UserExtUpdateEmailStatement.Bind(new
            { ext_id = user.ExtId, email = user.Email, dt_mod = nowId }));
}
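
One mechanism that could produce this symptom, offered as an assumption and
not a confirmed diagnosis: statements in a batch that carry no explicit
USING TIMESTAMP share one write timestamp, and when two writes to the same
cell tie on timestamp, a tombstone (a null write) wins over a live value.
If the prepared INSERT behind UserInsertStatement lists ext_id and the
missing value is bound as null (both are assumptions, since the statement
text is not shown), the insert's null would beat the update's value inside
one batch, while executing the insert first would give it an older
timestamp. The Python sketch below only simulates that reconciliation rule;
Cell and reconcile are hypothetical names, not driver or server APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One write to a single column: a value (None = tombstone) and its timestamp."""
    value: Optional[str]
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the surviving write for one column, last-write-wins style."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Timestamps tie: a tombstone beats a live value.
    if a.value is None:
        return a
    if b.value is None:
        return b
    # Both live: the greater value wins the tie (string comparison here
    # stands in for Cassandra's byte comparison).
    return a if a.value >= b.value else b


# Same batch: the INSERT's null ext_id and the UPDATE's value share a timestamp.
same_batch = reconcile(Cell(None, 1000), Cell("ext-42", 1000))

# INSERT executed first, UPDATE sent later: the UPDATE has the newer timestamp.
split_batches = reconcile(Cell(None, 1000), Cell("ext-42", 1001))

print(same_batch.value)
print(split_batches.value)
```

If that is what is happening here, leaving ext_id out of the INSERT's column
list when no value is supplied would avoid writing the tombstone.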



-- 
Randy Lynn
rl...@getavail.com



Re: Best approach for node decommission

2018-07-13 Thread Soumya Jena
If it helps, you can use 'nodetool netstats' to monitor the streaming. If
you do not see any streams stuck there, then you should be good. Also, if
you are concerned, you can run "nodetool repair" after the decommission
just to make sure.

I think what you are doing is okay. If the node is alive, then nodetool
decommission is the best approach. If the node is dead, then you have to
use nodetool removenode.

On Thu, Jul 12, 2018 at 10:45 AM, rajasekhar kommineni wrote:

> Hi All,
>
> Can anybody let me know the best approach for decommissioning a node in
> the cluster? My cluster is using vnodes. Is there any way to verify that
> all the data of the decommissioning node has been moved to the remaining
> nodes before completely shutting down the server?
>
> I followed the procedure below:
>
> 1) nodetool flush
> 2) nodetool repair
> 3) nodetool decommission
>
> The aggregate Load before the node3 decommission is 1411.47 MiB and after
> it is 1380.15 MiB. Can I ignore the size difference and assume that all
> the data of node3 has been moved to the other nodes?
>
> I am looking for a good data validation process that does not depend on
> the application team for verification.
>
> *Total Load: 1411.47 MiB*
>
> --  Address  Load        Tokens  Owns  Host ID                               Rack
> UN  node1    220.48 MiB  256     ?     ff09b08b-29c1-4365-a3b7-1eea51f7d575  rack1
> UN  node2    216.53 MiB  256     ?     4b565a31-4c77-418f-a47f-5e0eb2ec5624  rack1
> UN  node3    64.52 MiB   256     ?     12b29812-cc60-456c-95a9-0e339c249bc8  rack1
> UN  node4    195.84 MiB  256     ?     0424a882-de4f-4e6a-b642-6ce9f4621e04  rack1
> UN  node5    179.07 MiB  256     ?     2f291a2e-b10d-4364-8192-13e107a9c322  rack1
> UN  node6    213.75 MiB  256     ?     cf10166b-cfae-44fd-8bca-f55a4f9ef491  rack1
> UN  node7    158.54 MiB  256     ?     ef8454c7-3005-487a-a3d4-e0065edfd99f  rack1
> UN  node8    162.74 MiB  256     ?     7d786e46-1c11-485c-a943-bbcca6729ae1  rack1
>
> *Total Load: 1380.15 MiB*
>
> --  Address  Load        Tokens  Owns  Host ID                               Rack
> UN  node1    229.04 MiB  256     ?     ff09b08b-29c1-4365-a3b7-1eea51f7d575  rack1
> UN  node2    225.52 MiB  256     ?     4b565a31-4c77-418f-a47f-5e0eb2ec5624  rack1
> UN  node4    195.84 MiB  256     ?     0424a882-de4f-4e6a-b642-6ce9f4621e04  rack1
> UN  node5    179.07 MiB  256     ?     2f291a2e-b10d-4364-8192-13e107a9c322  rack1
> UN  node6    229.4 MiB   256     ?     cf10166b-cfae-44fd-8bca-f55a4f9ef491  rack1
> UN  node7    158.54 MiB  256     ?     ef8454c7-3005-487a-a3d4-e0065edfd99f  rack1
> UN  node8    162.74 MiB  256     ?     7d786e46-1c11-485c-a943-bbcca6729ae1  rack1
>
> Thanks,
>
>
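
As a rough cross-check of the Load figures quoted above, the Python sketch
below sums each nodetool status listing and compares what node3 held with
what the remaining nodes gained. The numbers are copied from the quoted
output; the variable names are illustrative. Load is on-disk size, which
also moves with compaction, so this is a sanity check on the arithmetic and
not a validation of the data itself.

```python
# Load per node in MiB, copied from the quoted nodetool status output.
before = {
    "node1": 220.48, "node2": 216.53, "node3": 64.52, "node4": 195.84,
    "node5": 179.07, "node6": 213.75, "node7": 158.54, "node8": 162.74,
}
after = {
    "node1": 229.04, "node2": 225.52, "node4": 195.84, "node5": 179.07,
    "node6": 229.40, "node7": 158.54, "node8": 162.74,
}

total_before = round(sum(before.values()), 2)
total_after = round(sum(after.values()), 2)

# Growth on the nodes that remain after node3 left the ring.
gained = round(sum(after[n] - before[n] for n in after), 2)

print(f"total before:  {total_before} MiB")
print(f"total after:   {total_after} MiB")
print(f"node3 held:    {before['node3']} MiB")
print(f"others gained: {gained} MiB")
```

With these figures the remaining nodes grew by about 33.2 MiB while node3
held 64.52 MiB, so Load alone cannot confirm or rule out a complete
hand-off.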


Re: cassandra cluster sizing

2018-07-13 Thread Vitaliy Semochkin
Jeff, thank you very much for the reply.
I will try to use 4TB per instance.

If I understand it correctly, leveled compaction can lead to 50%
https://docs.datastax.com/en/dse-planning/doc/planning/planningHardware.html

Regarding the question of running multiple instances per server, am I
correct that with 3.11 instances, each having several dedicated disks,
running multiple instances per server is OK?
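
To put the free-space guidance quoted below into numbers, here is a rough
Python sketch. The 4 TB disk and the 100 requests per second at 1 KB each
come from this thread. Decimal units, no compression, the whole write
stream landing on a single node, and the function name usable_tb are
simplifying assumptions made for illustration.

```python
def usable_tb(disk_tb: float, free_fraction: float) -> float:
    """Data capacity left after reserving free_fraction of the disk."""
    return disk_tb * (1.0 - free_fraction)


DISK_TB = 4.0                      # per-instance size mentioned in the thread
WRITE_BYTES_PER_SEC = 100 * 1000   # 100 requests/s at 1 KB each (decimal KB)

# Raw ingest per day in GB, before replication or compression.
ingest_gb_per_day = WRITE_BYTES_PER_SEC * 86_400 / 1e9

for label, free in (("size-tiered, 50% free", 0.50),
                    ("leveled, 30% free", 0.30)):
    capacity_tb = usable_tb(DISK_TB, free)
    days_to_fill = capacity_tb * 1000 / ingest_gb_per_day
    print(f"{label}: {capacity_tb:.1f} TB usable, "
          f"about {days_to_fill:.0f} days to fill")
```

Replication and the number of nodes change the per-node ingest, so the
days-to-fill figure is only an order-of-magnitude guide.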


On Thu, Jul 12, 2018 at 5:47 PM Jeff Jirsa wrote:
>
> You can certainly go higher than a terabyte - 4 or so is common. I've
> heard of people doing up to 12 TB, with the awareness that time to
> replace scales with size on disk, so a very large host will take longer
> to rebuild than a small host.
>
> The 50% free guidance only applies to size-tiered compaction, and given
> your throughput you may prefer leveled compaction anyway. With leveled
> you should target 30% free for compaction and repair.
>
> You don't need more than one Cassandra instance per host for 4 TB, but
> you may want to consider it for more than that - multiple instances are
> especially useful if you have multiple (lots of) disks and are running
> Cassandra before CASSANDRA-6696 (which made JBOD safer).
>
> --
> Jeff Jirsa
>
>
> > On Jul 12, 2018, at 7:37 AM, Vitaliy Semochkin wrote:
> >
> > Hi,
> >
> > What is the maximum amount of data a Cassandra 3 server in a cluster
> > can serve? The documentation says it is only 1TB.
> > If the load is not high (only about 100 requests per second with 1kb
> > of data each), is it safe to go above 1TB (let's say 5TB per
> > server)?
> > What is the safe maximum disk size a server in such a cluster can serve?
> >
> > The documentation also says that compaction requires free space equal
> > to 50% of the occupied disk space. In case I don't have update
> > operations (only inserts), do I need that much extra space for
> > compaction?
> >
> > In articles (outside the DataStax docs) I read that it is a common
> > practice to launch more than one Cassandra server on one physical
> > server in order to be able to use more than 1TB of hard drive per
> > server. Is it recommended?
> >
> > -
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
> >