Re: Storing log structured data in Cassandra without compactions for performance boost.

2014-05-15 Thread Nate McCall
The following article has some good information for what you describe:
http://www.datastax.com/dev/blog/optimizations-around-cold-sstables

Some related tickets which will provide background:
https://issues.apache.org/jira/browse/CASSANDRA-5228
https://issues.apache.org/jira/browse/CASSANDRA-5515
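If you do experiment with the "set the thresholds very high" idea from the quoted
question below, one way to express it is through CQL via the DataStax Java driver.
This is only a sketch: the min_threshold/max_threshold sub-option names are an
assumption for 2.0-era SizeTieredCompactionStrategy, and the contact point, keyspace
and table are placeholders.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class DeferCompactions {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
            Session session = cluster.connect();
            // Raise the STCS thresholds so minor compactions are effectively never triggered.
            session.execute("ALTER TABLE logs.log_lines WITH compaction = "
                    + "{'class': 'SizeTieredCompactionStrategy', "
                    + "'min_threshold': '1024', 'max_threshold': '1024'}");
            cluster.close();
        }
    }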


On Tue, May 6, 2014 at 7:55 PM, Kevin Burton bur...@spinn3r.com wrote:

 I'm looking at storing log data in Cassandra…

 Every record is a unique timestamp for the key, and then the log line for
 the value.

 I think it would be best to just disable compactions.

 - there will never be any deletes.

 - all the data will be accessed in time range (probably partitioned
 randomly) and sequentially.

 So every time a memtable flushes, we will just keep that SSTable forever.

 Compacting the data is kind of redundant in this situation.

 I was thinking the best strategy is to use setcompactionthreshold and set
 the value VERY high so that compactions are never triggered.

 Also, it would be IDEAL to be able to tell Cassandra to just drop a full
 SSTable so that I can truncate older data without having to do a major
 compaction and without having to mark everything with a tombstone.  Is this
 possible?



 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 Skype: *burtonator*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile: https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are
 people.




-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Query returns incomplete result

2014-05-15 Thread Lu, Boying
Hi, All,

I use the astyanax 1.56.48 + Cassandra 2.0.6 in my test codes and do some query 
like this:

query = keyspace.prepareQuery(..).getKey(...)
    .autoPaginate(true)
    .withColumnRange(new RangeBuilder().setLimit(pageSize).build());

ColumnList<IndexColumnName> result;
result = query.execute().getResult();
while (!result.isEmpty()) {
    // handle result here
    result = query.execute().getResult();
}

There are 2003 records in the DB. If the pageSize is set to 1100, I get only 
2002 records back, and if the pageSize is set to 3000, I get all 2003 records back.

Does anyone know why? Is it a bug?

Thanks

Boying



Setting the read/write consistency globally in the CQL3 DataStax Java driver

2014-05-15 Thread Sebastian Schmidt
Hi,

I'm using the CQL3 Datastax Cassandra Java client. I want to use a
global read and write consistency for my queries. I know that I can set
the consistencyLevel for every single prepared statement. But I want to
do that just once per cluster or once per session. Is that possible?

Kind Regards,
Sebastian
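For reference: with the DataStax Java driver a default consistency level can be set
once per Cluster via QueryOptions, and statements that don't override it will use
that default. A minimal sketch (contact point, keyspace and level are placeholders):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.QueryOptions;
    import com.datastax.driver.core.Session;

    public class GlobalConsistency {
        public static void main(String[] args) {
            // The QueryOptions default applies to every statement executed through
            // sessions created from this Cluster, unless overridden per statement.
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    .withQueryOptions(new QueryOptions()
                            .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
                    .build();
            Session session = cluster.connect("my_keyspace");
            session.execute("SELECT * FROM my_table WHERE id = 1"); // runs at LOCAL_QUORUM
            cluster.close();
        }
    }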





Re: Disable reads during node rebuild

2014-05-15 Thread Aaron Morton
 As of 2.0.7, driftx has added this long-requested feature.
Thanks

A
-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 13/05/2014, at 9:36 am, Robert Coli rc...@eventbrite.com wrote:

 On Mon, May 12, 2014 at 10:18 AM, Paulo Ricardo Motta Gomes 
 paulo.mo...@chaordicsystems.com wrote:
 Is there a way to disable reads from a node while performing rebuild from 
 another datacenter? I tried starting the node in write survey mode, but the 
 nodetool rebuild command does not work in this mode.
 
 As of 2.0.7, driftx has added this long-requested feature.
 
 https://issues.apache.org/jira/browse/CASSANDRA-6961
 
 Note that it is impossible to completely close the race window here as long 
 as writes are incoming; this functionality just dramatically shortens it.
 
 =Rob
  



Re: Question about READS in a multi DC environment.

2014-05-15 Thread graham sanderson
Yeah, but all the requests for data/digest are sent at the same time… responses 
that aren’t “needed” to complete the request are dealt with asynchronously 
(possibly causing repair). 

In the original trace (which is confusing because I don't think the clocks are 
in sync) I don't see anything that makes me believe it is blocking for all 3 
responses; it actually does reads on all 3 nodes even if only digests are 
required.
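As discussed in the quoted messages below, the read_repair_chance of 1.0 is what
drags the remote DCs into a LOCAL_ONE read. A sketch of dialing it down (executed
here through the Java driver; the exact values are placeholders, and
dclocal_read_repair_chance keeps any read repair inside the local DC):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class LowerReadRepairChance {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("192.168.25.111").build();
            Session session = cluster.connect();
            // Cross-DC read repair off, occasional DC-local read repair on.
            session.execute("ALTER TABLE kairosdb.data_points "
                    + "WITH read_repair_chance = 0.0 "
                    + "AND dclocal_read_repair_chance = 0.1");
            cluster.close();
        }
    }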

On May 12, 2014, at 12:37 AM, DuyHai Doan doanduy...@gmail.com wrote:

 Isn't read repair supposed to be done asynchronously in the background?
 
 
 On Mon, May 12, 2014 at 2:07 AM, graham sanderson gra...@vast.com wrote:
 You have a read_repair_chance of 1.0 which is probably why your query is 
 hitting all data centers.
 
 On May 11, 2014, at 3:44 PM, Mark Farnan devm...@petrolink.com wrote:
 
   I'm trying to understand READ load in Cassandra across a multi-datacenter 
   cluster (specifically why it seems to be hitting more than one DC) and 
   hope someone can help.
  
   From what I'm seeing here, a READ with consistency LOCAL_ONE seems to 
   be hitting all 3 datacenters, rather than just the one I'm connected to.
   I see 'Read 101 live and 0 tombstoned cells' from EACH of the 3 DCs in 
   the trace, which seems wrong.
   I have tried every consistency level, same result. This is also the same 
   from my C# code via the DataStax driver (where I first noticed the issue).
  
   Can someone please shed some light on what is occurring? Specifically I 
   don't want a query on one DC going anywhere near the other 2 as a rule, as 
   in production these DCs will be across slower links.
 
 
  Query:  (NOTE:  Whilst this uses a kairosdb table,  i'm just playing with 
  queries against it as it has 100k columns in this key for testing).
 
  cqlsh:kairosdb consistency local_one
  Consistency level set to LOCAL_ONE.
 
  cqlsh:kairosdb select * from data_points where key = 
  0x6d61726c796e2e746573742e74656d7034000145b514a400726f6f6d3d6f6963653a
   limit 1000;
 
  ... Some return data  rows listed here which I've removed 
 
   Query Response Trace:
  
    activity                                                                  | timestamp    | source         | source_elapsed
   ---------------------------------------------------------------------------+--------------+----------------+----------------
    execute_cql3_query                                                        | 07:18:12,692 | 192.168.25.111 |              0
    Message received from /192.168.25.111                                     | 07:18:00,706 | 192.168.25.131 |             50
    Executing single-partition query on data_points                           | 07:18:00,707 | 192.168.25.131 |            760
    Acquiring sstable references                                              | 07:18:00,707 | 192.168.25.131 |            814
    Merging memtable tombstones                                               | 07:18:00,707 | 192.168.25.131 |            924
    Bloom filter allows skipping sstable 191                                  | 07:18:00,707 | 192.168.25.131 |           1050
    Bloom filter allows skipping sstable 190                                  | 07:18:00,707 | 192.168.25.131 |           1166
    Key cache hit for sstable 189                                             | 07:18:00,707 | 192.168.25.131 |           1275
    Seeking to partition beginning in data file                               | 07:18:00,707 | 192.168.25.131 |           1293
    Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 07:18:00,708 | 192.168.25.131 |           2173
    Merging data from memtables and 1 sstables                                | 07:18:00,708 | 192.168.25.131 |           2195
    Read 1001 live and 0 tombstoned cells                                     | 07:18:00,709 | 192.168.25.131 |           3259
    Enqueuing response to /192.168.25.111                                     |
Re: How long are expired values actually returned?

2014-05-15 Thread Aaron Morton
 Is this normal or am I doing something wrong?
Probably the latter.

But the TTL is set based on the system clock on the server, so the first thought would 
be to check that the times are correct. 

If that fails, send over the schema and the insert. 
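If it's not the clocks, a minimal repro sketch is a useful thing to send along with
the schema. Something like this (DataStax Java driver; the test keyspace and table
here are hypothetical), remembering that the expiry is computed from the server
clock at write time:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class TtlCheck {
        public static void main(String[] args) throws InterruptedException {
            Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
            Session session = cluster.connect("test");
            session.execute("CREATE TABLE IF NOT EXISTS ttl_test (id int PRIMARY KEY, v text)");
            // Row should expire 5 seconds after the write is applied on the server.
            session.execute("INSERT INTO ttl_test (id, v) VALUES (1, 'hello') USING TTL 5");
            Thread.sleep(7000);
            // With clocks in sync this prints 0; a non-zero count points at clock skew
            // or at the insert not actually carrying the TTL.
            System.out.println(session.execute("SELECT * FROM ttl_test WHERE id = 1").all().size());
            cluster.close();
        }
    }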

Cheers
Aaron

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 9/05/2014, at 2:44 am, Sebastian Schmidt isib...@gmail.com wrote:

 Hi,
 
 I'm using the TTL feature for my application. In my tests, when using a
 TTL of 5, the inserted rows are still returned after 7 seconds, and
 after 70 seconds. Is this normal or am I doing something wrong?
 
 Kind Regards,
 Sebastian
 



Re: Disable reads during node rebuild

2014-05-15 Thread sankalp kohli
This might be useful: Nodetool command to disable reads:
https://issues.apache.org/jira/browse/CASSANDRA-6760


On Wed, May 14, 2014 at 8:31 AM, Paulo Ricardo Motta Gomes 
paulo.mo...@chaordicsystems.com wrote:

 That's a nice workaround, will be really helpful in emergency situations
 like this.

 Thanks,


 On Mon, May 12, 2014 at 6:58 PM, Aaron Morton aa...@thelastpickle.com wrote:

 I'm not able to replace a dead node using the ordinary procedure
 (bootstrap+join), and would like to rebuild the replacement node from
 another DC.

 Normally when you want to add a new DC to the cluster, the command to use
 is nodetool rebuild $DC_NAME (with auto_bootstrap: false). That will get
 the node to stream data from $DC_NAME.

 The problem is that if I start a node with auto_bootstrap=false to
 perform the rebuild, it automatically starts serving empty reads
 (CL=LOCAL_ONE).

  When adding a new DC the nodes won't be processing reads; that is not
 the case for you.

 You should disable the client APIs to prevent the clients from calling
 the new nodes: use -Dcassandra.start_rpc=false and
 -Dcassandra.start_native_transport=false in cassandra-env.sh, or the appropriate
 settings in cassandra.yaml.

 Disabling reads from other nodes will be harder. IIRC during bootstrap a
 different timeout (based on ring_delay) is used to detect if the
 bootstrapping node is down. However if the node is running and you use
 nodetool rebuild I'm pretty sure the normal gossip failure detectors will
 kick in, which means you cannot disable gossip to prevent reads. Also we
 would want the node to be up for writes.

 But what you can do is artificially set the severity of the node high so
 the dynamic snitch will route around it. See
 https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/locator/DynamicEndpointSnitchMBean.java#L37


 * Set the value to something high on the node you will be rebuilding; the
 number of cores on the system should do (jmxterm is handy for this:
 http://wiki.cyclopsgroup.org/jmxterm). See the JMX sketch after these steps.
 * Check nodetool gossipinfo on the other nodes to see the SEVERITY app
 state has propagated.
 * Watch completed ReadStage tasks on the node you want to rebuild. If you
 have read repair enabled it will still get some traffic.
 * Do rebuild
 * Reset severity to 0
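
 A minimal JMX sketch of the severity steps above, in plain Java rather than
 jmxterm. The host, port and exact MBean name are assumptions to verify against
 your version (the MBean is the DynamicEndpointSnitchMBean linked earlier):

     import javax.management.Attribute;
     import javax.management.MBeanServerConnection;
     import javax.management.ObjectName;
     import javax.management.remote.JMXConnector;
     import javax.management.remote.JMXConnectorFactory;
     import javax.management.remote.JMXServiceURL;

     public class SetSnitchSeverity {
         public static void main(String[] args) throws Exception {
             // 7199 is the default JMX port; point this at the node being rebuilt.
             JMXServiceURL url = new JMXServiceURL(
                     "service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi");
             JMXConnector connector = JMXConnectorFactory.connect(url);
             try {
                 MBeanServerConnection mbs = connector.getMBeanServerConnection();
                 ObjectName snitch = new ObjectName(
                         "org.apache.cassandra.db:type=DynamicEndpointSnitch");
                 // Raise severity so the dynamic snitch routes reads away from this node.
                 mbs.setAttribute(snitch, new Attribute("Severity", 8.0));
                 // ... run nodetool rebuild $DC_NAME, then reset:
                 // mbs.setAttribute(snitch, new Attribute("Severity", 0.0));
             } finally {
                 connector.close();
             }
         }
     }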

 Hope that helps.
 Aaron

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 13/05/2014, at 5:18 am, Paulo Ricardo Motta Gomes 
 paulo.mo...@chaordicsystems.com wrote:

 Hello,

 I'm not able to replace a dead node using the ordinary procedure
 (bootstrap+join), and would like to rebuild the replacement node from
 another DC. The problem is that if I start a node with auto_bootstrap=false
 to perform the rebuild, it automatically starts serving empty reads
 (CL=LOCAL_ONE).

 Is there a way to disable reads from a node while performing rebuild from
 another datacenter? I tried starting the node in write survey mode, but
 the nodetool rebuild command does not work in this mode.

 Thanks,

 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200





 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200



Re: Effect of number of keyspaces on write-throughput....

2014-05-15 Thread Krishna Chaitanya
Hello,
Thanks for the reply. Currently, each client is writing about 470 packets
per second where each packet is 1500 bytes. I have four clients writing
simultaneously to the cluster. Each client is writing to a separate
keyspace simultaneously. Hence, is there a lot of switching of keyspaces?

The total throughput is coming to around 1900 packets per second
when using multiple keyspaces. This is because there are 4 clients and each
one is writing around 470 pkts/sec. But I observed that when using a
single keyspace, the write throughput reduced slightly to 1800 pkts/sec,
while I actually expected it to increase since there is no switching of
contexts now. Why is this so? 470 packets is the maximum I can write from
each client currently, since it is a limitation of my client program.
I should also mention that these tests are being run on
single- and double-node clusters with all the write requests going only to
a single Cassandra server.

 Can you also kindly explain how factors like using a single
v/s multiple keyspaces, distributing write requests to a single cassandra
node v/s multiple cassandra nodes, etc. affect the write throughput?  Are
there any other factors that affect write throughput other than these?
Because, a single cassandra node seems to be able to handle all these write
requests as I am not able to see any significant improvement by
distributing write requests among multiple nodes.

Thanking you.
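As Aaron notes in the quoted reply below, with Thrift the keyspace is part of the
connection state, so switching keyspaces costs a round trip. A sketch of the usual
workaround: keep one connection pinned per keyspace so set_keyspace is called once
per connection rather than on the write path (host, port and keyspace names are
placeholders; this uses the raw Thrift client rather than libQtCassandra):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class KeyspacePinnedClients {
        private final Map<String, Cassandra.Client> clients =
                new HashMap<String, Cassandra.Client>();

        // Returns a client already bound to the keyspace, so writers never pay the
        // set_keyspace round trip on the hot path.
        public synchronized Cassandra.Client clientFor(String keyspace) throws Exception {
            Cassandra.Client client = clients.get(keyspace);
            if (client == null) {
                TTransport transport = new TFramedTransport(new TSocket("10.0.0.1", 9160));
                transport.open();
                client = new Cassandra.Client(new TBinaryProtocol(transport));
                client.set_keyspace(keyspace);
                clients.put(keyspace, client);
            }
            return client;
        }
    }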

On May 12, 2014 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote:

 On the homepage of libQtCassandra, its mentioned that switching between
 keyspaces is costly when storing into Cassandra thereby affecting the write
 throughput. Is this necessarily true for other libraries like pycassa and
 hector as well?

 When using the thrift connection the keyspace is a part of the connection
 state, so changing keyspaces requires a round trip to the server. Not
 hugely expensive, but it adds up if you do it a lot.

 Can I increase the write throughput by configuring all the
 clients to store in a single keyspace instead of multiple keyspaces to
 increase the write throughput?

 You should expect to get 3,000 to 4,000 writes per core per node.

 What are you getting now?

 Cheers
 A

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 11/05/2014, at 4:06 pm, Krishna Chaitanya bnsk1990r...@gmail.com
 wrote:

 Hello,
 I have an application that writes network packets to a Cassandra cluster
 from a number of client nodes. It uses the libQtCassandra library to access
 Cassandra. On the homepage of libQtCassandra, its mentioned that switching
 between keyspaces is costly when storing into Cassandra thereby affecting
 the write throughput. Is this necessarily true for other libraries like
 pycassa and hector as well?
 Can I increase the write throughput by configuring all the
 clients to store in a single keyspace instead of multiple keyspaces to
 increase the write throughput?

 Thankyou.





Mutation messages dropped

2014-05-15 Thread Raveendran, Varsha IN BLR STS
Hello,

I am writing around 10 million records continuously into a single-node Cassandra 
(2.0.5).
In the Cassandra log file I see an entry "272 MUTATION messages dropped in last 
5000ms". Does this mean that 272 records were not written successfully?

Thanks,
Varsha



Re: How to balance this cluster out ?

2014-05-15 Thread Aaron Morton
This is not a problem with the token assignments. Here are the ideal assignments 
from the tools/bin/token-generator script:

DC #1:
  Node #1:0
  Node #2:   56713727820156410577229101238628035242
  Node #3:  113427455640312821154458202477256070484

You are pretty close, but the order of the nodes in the output is a little odd; I 
would normally expect node 2 to be first.
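For reference, those evenly spaced RandomPartitioner tokens follow a simple formula,
token_i = i * 2^127 / N. A sketch (rounding may differ by one from what the
token-generator script prints):

    import java.math.BigInteger;

    public class IdealTokens {
        public static void main(String[] args) {
            int nodes = 3;
            BigInteger ringSize = BigInteger.valueOf(2).pow(127); // RandomPartitioner token space
            for (int i = 0; i < nodes; i++) {
                // Node i gets i * 2^127 / N, spacing the nodes evenly around the ring.
                BigInteger token = ringSize.multiply(BigInteger.valueOf(i))
                                           .divide(BigInteger.valueOf(nodes));
                System.out.println("Node #" + (i + 1) + ": " + token);
            }
        }
    }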

The first step would be to check the logs on node 1 to see if it's failing at 
compaction, and to check if it's holding a lot of hints. Then make sure repair 
is running so the data is distributed. 

Hope that helps. 
Aaron

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/05/2014, at 11:58 pm, Oleg Dulin oleg.du...@gmail.com wrote:

 I have a cluster that looks like this:
 
 Datacenter: us-east
 ==
 Replicas: 2
 
 Address RackStatus State   LoadOwns   
 Token
 
 113427455640312821154458202477256070484
 *.*.*.1   1b  Up Normal  141.88 GB   66.67% 
 56713727820156410577229101238628035242
 *.*.*.2  1a  Up Normal  113.2 GB66.67%  210
 *.*.*.3   1d  Up Normal  102.37 GB   66.67% 
 113427455640312821154458202477256070484
 
 
 Obviously, the first node in 1b has 40% more data than the others. If I 
 wanted to rebalance this cluster, how would I go about that? Would shifting 
 the tokens accomplish what I need, and which tokens?
 
 Regards,
 Oleg
 
 



Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-15 Thread Anton Brazhnyk
Greetings,

I'm reading data from C* with Spark (via ColumnFamilyInputFormat) and I'd like 
to read just part of it - something like Spark's sample() function.
Cassandra's API seems to allow this with its 
ConfigHelper.setInputRange(jobConfiguration, startToken, endToken) method, but 
it doesn't work.
The limit is just ignored and the entire column family is scanned. It seems 
this kind of feature is just not supported, 
and the source of AbstractColumnFamilyInputFormat.getSplits confirms that (IMO).
Questions:
1. Am I right that there is no way to get some data limited by token range with 
ColumnFamilyInputFormat?
2. Is there other way to limit the amount of data read from Cassandra with 
Spark and ColumnFamilyInputFormat,
so that this amount is predictable (like 5% of entire dataset)?


WBR,
Anton
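
For context, the configuration described above looks roughly like this (method names
as in the Cassandra 1.2/2.0 Hadoop support; host, keyspace, column family and token
values are placeholders):

    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.hadoop.conf.Configuration;

    public class InputRangeSetup {
        public static Configuration build() {
            Configuration conf = new Configuration();
            ConfigHelper.setInputInitialAddress(conf, "10.0.0.1");
            ConfigHelper.setInputRpcPort(conf, "9160");
            ConfigHelper.setInputPartitioner(conf, "Murmur3Partitioner");
            ConfigHelper.setInputColumnFamily(conf, "my_keyspace", "my_cf");
            // The call in question: per the report above it is ignored by
            // AbstractColumnFamilyInputFormat.getSplits, which still scans the whole ring.
            ConfigHelper.setInputRange(conf, "-9223372036854775808", "-4611686018427387904");
            return conf;
        }
    }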




Re: Automatic tombstone removal issue (STCS)

2014-05-15 Thread Paulo Ricardo Motta Gomes
I just updated CASSANDRA-6563 with more details and proposed a patch to
solve the issue, in case anyone else is interested.

https://issues.apache.org/jira/browse/CASSANDRA-6563

On Tue, May 6, 2014 at 10:00 PM, Paulo Ricardo Motta Gomes 
paulo.mo...@chaordicsystems.com wrote:

 Robert: thanks for the support, you are right, this belonged more to the
 dev list but I didn't think of it.

 Yuki: thanks a lot for the clarification, this is what I suspected.

 I understand it's costly to check row-by-row overlap in order to decide if
 an SSTable is a candidate for compaction, but doesn't the compaction process
 already perform this check when removing tombstones? So, couldn't this
 check be dropped at decision time and the compaction allowed to run anyway?

 This optimization is especially interesting with large STCS sstables, where
 the token range will very likely overlap with all other sstables, so it's a
 pity it's almost never triggered in these cases.

 On Tue, May 6, 2014 at 9:32 PM, Yuki Morishita mor.y...@gmail.com wrote:

 Hi Paulo,

 The reason we check overlap is to avoid resurrecting deleted data by
 dropping the tombstone marker from only a single SSTable.
 And we don't want to check row by row to determine if an SSTable is
 droppable since it takes time, so we use token ranges to determine if
 it MAY have droppable columns.
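
 To make the trade-off concrete, here is a simplified illustration (not the actual
 Cassandra source) of the decision being described: a candidate SSTable is only
 compacted on its own when its droppable-tombstone ratio is high enough and no other
 SSTable's token range overlaps it. With STCS every SSTable typically covers most of
 the ring, so the overlap test nearly always fails, which is the behaviour Paulo is
 reporting.

     import java.util.Arrays;
     import java.util.List;

     public class TombstoneCompactionSketch {
         static class SSTableInfo {
             final long firstToken, lastToken;        // token range covered by the SSTable
             final double droppableTombstoneRatio;
             SSTableInfo(long first, long last, double ratio) {
                 firstToken = first; lastToken = last; droppableTombstoneRatio = ratio;
             }
             boolean overlaps(SSTableInfo other) {
                 return firstToken <= other.lastToken && other.firstToken <= lastToken;
             }
         }

         static boolean worthDroppingTombstones(SSTableInfo candidate,
                                                List<SSTableInfo> others,
                                                double threshold) {
             if (candidate.droppableTombstoneRatio < threshold)
                 return false;
             for (SSTableInfo other : others)
                 if (candidate.overlaps(other))
                     return false;   // cheap proxy for "deleted data may be shadowed elsewhere"
             return true;
         }

         public static void main(String[] args) {
             // Two STCS SSTables that each span the whole ring: never eligible.
             SSTableInfo big = new SSTableInfo(Long.MIN_VALUE, Long.MAX_VALUE, 0.7);
             SSTableInfo other = new SSTableInfo(Long.MIN_VALUE, Long.MAX_VALUE, 0.1);
             System.out.println(worthDroppingTombstones(big, Arrays.asList(other), 0.2)); // false
         }
     }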

 On Tue, May 6, 2014 at 7:14 PM, Paulo Ricardo Motta Gomes
 paulo.mo...@chaordicsystems.com wrote:
  Hello,
 
  Sorry for being persistent, but I'd love to clear my understanding on
 this.
  Has anyone seen single sstable compaction being triggered for STCS
 sstables
  with high tombstone ratio?
 
  Because if the above understanding is correct, the current
 implementation
  almost never triggers this kind of compaction, since the token ranges
 of a
  node's sstable almost always overlap. Could this be a bug or is it
 expected
  behavior?
 
  Thank you,
 
 
 
  On Mon, May 5, 2014 at 8:59 AM, Paulo Ricardo Motta Gomes
  paulo.mo...@chaordicsystems.com wrote:
 
  Hello,
 
  After noticing that automatic tombstone removal (CASSANDRA-3442) was
 not
  working in an append-only STCS CF with 40% of droppable tombstone
 ratio I
  investigated why the compaction was not being triggered in the largest
  SSTable with 16GB and about 70% droppable tombstone ratio.
 
  When the code goes to check if the SSTable is candidate to be compacted
  (AbstractCompactionStrategy.worthDroppingTombstones), it verifies if
 all the
  others SSTables overlap with the current SSTable by checking if the
 start
  and end tokens overlap. The problem is that all SSTables contain
 pretty much
  the whole node token range, so all of them overlap nearly all the
 time, so
  the automatic tombstone removal never happens. Is there any case in
 STCS
  where all sstables token ranges DO NOT overlap?
 
  I understand during the tombstone removal process it's necessary to
 verify
  if the compacted row exists in any other SSTable, but I don't
 understand why
  it's necessary to verify if the token ranges overlap to decide if a
  tombstone compaction must be executed on a single SSTable with high
  droppable tombstone ratio.
 
  Any clarification would be kindly appreciated.
 
  PS: Cassandra version: 1.2.16
 
  --
  Paulo Motta
 
  Chaordic | Platform
  www.chaordic.com.br
  +55 48 3232.3200
 
 
 
 
  --
  Paulo Motta
 
  Chaordic | Platform
  www.chaordic.com.br
  +55 48 3232.3200



 --
 Yuki Morishita
  t:yukim (http://twitter.com/yukim)




 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200




-- 
*Paulo Motta*

Chaordic | *Platform*
*www.chaordic.com.br http://www.chaordic.com.br/*
+55 48 3232.3200


Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-15 Thread Kevin Burton


 We basically do this same thing in one of our production clusters, but
 rather than dropping SSTables, we drop Column Families. We time-bucket our
 CFs, and when a CF has passed some time threshold (metadata or embedded in
 CF name), it is dropped. This means there is a home-grown system that is
 doing the bookkeeping/maintenance rather than relying on C*s inner
 workings. It is unfortunate that we have to maintain a system which
 maintains CFs, but we've been in a pretty good state for the last 12 months
 using this method.


Yup. This is exactly what we do for MySQL, but it's kind of a shame to do
it with Cassandra. The SSTable system can do it directly. I had been
working on a bigtable implementation (which I put on hold for now) that
supported this feature. Since Cassandra can do it directly, it seems a
shame that it's not exposed.

Also, this means you generally have to build a duplicate query layer on top
of CQL. For example, if a range query covers a time range that is split
between tables, you have to scan both.

And if you're doing lookups by key then you also have to scan all the
temporal column families, but this is exactly what Cassandra does with
SSTables.
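
As a concrete illustration of that duplicate layer, here is a hypothetical sketch of
the bookkeeping a time-bucketed design forces onto the application: mapping a
timestamp to its bucket table and finding every bucket a range query has to scan.
The table-name format and daily bucket size are assumptions, not anyone's actual
schema.

    import java.time.Instant;
    import java.time.LocalDate;
    import java.time.ZoneOffset;
    import java.time.format.DateTimeFormatter;
    import java.util.ArrayList;
    import java.util.List;

    public class TimeBuckets {
        private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

        // Which bucket table a single timestamp lives in.
        static String bucketFor(Instant ts) {
            return "log_" + FMT.format(ts.atZone(ZoneOffset.UTC).toLocalDate());
        }

        // Every bucket table a [start, end] range query must scan.
        static List<String> bucketsFor(Instant start, Instant end) {
            List<String> tables = new ArrayList<>();
            LocalDate day = start.atZone(ZoneOffset.UTC).toLocalDate();
            LocalDate last = end.atZone(ZoneOffset.UTC).toLocalDate();
            while (!day.isAfter(last)) {
                tables.add("log_" + FMT.format(day));
                day = day.plusDays(1);
            }
            return tables;
        }

        public static void main(String[] args) {
            System.out.println(bucketFor(Instant.parse("2014-05-15T12:00:00Z")));
            System.out.println(bucketsFor(Instant.parse("2014-05-14T00:00:00Z"),
                                          Instant.parse("2014-05-15T12:00:00Z")));
        }
    }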


 Some caveats:

 By default, C* makes snapshots of your data when a table is dropped. You
 can leave that and have something else clear up the snapshots, or if you're
 less paranoid, set auto_snapshot: false in the cassandra.yaml file.

 Cassandra does not handle 'quick' schema changes very well, and we found
 that only one node should be used for these changes. When adding or
 removing column families, we have a single, property defined C* node that
 is designated as the schema node. After making a schema change, we had to
 throw in an artificial delay to ensure that the schema change propagated
 through the cluster before making the next schema change. And of course,
 relying on a single node being up for schema changes is less than ideal, so
 handling fail over to a new node is important.

 The final, and hardest problem, is that C* can't really handle schema
 changes while a node is being bootstrapped (new nodes, replacing a dead
 node). If a column family is dropped, but the new node has not yet received
 that data from its replica, the node will fail to bootstrap when it finally
 begins to receive that data - there is no column family for the data to be
  written to, so that node will be stuck in the joining state, and its
  system keyspace needs to be wiped and re-synced to attempt to get back to a
 happy state. This unfortunately means we have to stop schema changes when a
 node needs to be replaced, but we have this flow down pretty well.


Nice.  This was excellent feedback.

Thanks!
-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile: https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


What % of cassandra developers are employed by Datastax?

2014-05-15 Thread Kevin Burton
I'm curious what % of cassandra developers are employed by Datastax?

… vs other companies.

When MySQL was acquired by Oracle this became a big issue because even
though you can't really buy an Open Source project, you can acquire all the
developers and essentially do the same thing.

It would be sad if all of Cassandra's 'eggs' were in one basket and a
similar situation happens with Datastax.

Seems like they're doing an awesome job to be sure but I guess it worries
me in the back of my mind.



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
Skype: *burtonator*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile: https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.


Stephanie Huynh invites you to Bay Area Big Data

2014-05-15 Thread Stephanie Huynh
Hi there,

Stephanie Huynh has invited you to Bay Area Big Data.

Stephanie Huynh says:

Please join the Bay Area Big Data group to learn about the BigData and NoSQL 
landscape!


To find out more and join, click here:
http://www.meetup.com/Bay-Area-Big-Data/t/if_42202442/?gj=ej4


If you're not interested, there's no need to do anything. Meetup will not keep 
your address on any list.


Failed to mkdirs $HOME/.cassandra

2014-05-15 Thread Bryan Talbot
How should the nodetool command be run as the user 'nobody'?

The nodetool command fails with an exception if it cannot create a
.cassandra directory in the current user's home directory.

I'd like to schedule some nodetool commands to run with least privilege as
cron jobs. I'd like to run them as the nobody user -- which typically has
/ as the home directory -- since that's what the user is typically used
for (minimum privileges).

None of the methods described in this JIRA actually seem to work (with
2.0.7 anyway) https://issues.apache.org/jira/browse/CASSANDRA-6475

Testing as a normal user with no write permissions to the home directory
(to simulate the nobody user)

[vagrant@local-dev ~]$ nodetool version
ReleaseVersion: 2.0.7
[vagrant@local-dev ~]$ rm -rf .cassandra/
[vagrant@local-dev ~]$ chmod a-w .

[vagrant@local-dev ~]$ nodetool flush my_ks my_cf
Exception in thread main FSWriteError in /home/vagrant/.cassandra
at
org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:305)
at
org.apache.cassandra.utils.FBUtilities.getToolsOutputDirectory(FBUtilities.java:690)
at
org.apache.cassandra.tools.NodeCmd.printHistory(NodeCmd.java:1504)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1204)
Caused by: java.io.IOException: Failed to mkdirs /home/vagrant/.cassandra
... 4 more

[vagrant@local-dev ~]$ HOME=/tmp nodetool flush my_ks my_cf
Exception in thread main FSWriteError in /home/vagrant/.cassandra
at
org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:305)
at
org.apache.cassandra.utils.FBUtilities.getToolsOutputDirectory(FBUtilities.java:690)
at
org.apache.cassandra.tools.NodeCmd.printHistory(NodeCmd.java:1504)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1204)
Caused by: java.io.IOException: Failed to mkdirs /home/vagrant/.cassandra
... 4 more

[vagrant@local-dev ~]$ env HOME=/tmp nodetool flush my_ks my_cf
Exception in thread main FSWriteError in /home/vagrant/.cassandra
at
org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:305)
at
org.apache.cassandra.utils.FBUtilities.getToolsOutputDirectory(FBUtilities.java:690)
at
org.apache.cassandra.tools.NodeCmd.printHistory(NodeCmd.java:1504)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1204)
Caused by: java.io.IOException: Failed to mkdirs /home/vagrant/.cassandra
... 4 more

[vagrant@local-dev ~]$ env user.home=/tmp nodetool flush my_ks my_cf
Exception in thread main FSWriteError in /home/vagrant/.cassandra
at
org.apache.cassandra.io.util.FileUtils.createDirectory(FileUtils.java:305)
at
org.apache.cassandra.utils.FBUtilities.getToolsOutputDirectory(FBUtilities.java:690)
at
org.apache.cassandra.tools.NodeCmd.printHistory(NodeCmd.java:1504)
at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1204)
Caused by: java.io.IOException: Failed to mkdirs /home/vagrant/.cassandra
... 4 more

[vagrant@local-dev ~]$ nodetool -Duser.home=/tmp flush my_ks my_cf
Unrecognized option: -Duser.home=/tmp
usage: java org.apache.cassandra.tools.NodeCmd --host arg command
...


Re: Cassandra 2.0.7 always failes due to 'too may open files' error

2014-05-15 Thread Nikolay Mihaylov
sorry, probably somebody mentioned it, but did you checked global limit?

cat /proc/sys/fs/file-max
cat /proc/sys/fs/file-nr



On Mon, May 5, 2014 at 10:31 PM, Bryan Talbot bryan.tal...@playnext.com wrote:

 Running

 # cat /proc/$(cat /var/run/cassandra.pid)/limits

 as root or your cassandra user will tell you what limits it's actually
 running with.




 On Sun, May 4, 2014 at 10:12 PM, Yatong Zhang bluefl...@gmail.com wrote:

 I was running 'repair' when the error occurred. And just a few days before
 that, I changed the compaction strategy to 'leveled'. Don't know if this helps.


 On Mon, May 5, 2014 at 1:10 PM, Yatong Zhang bluefl...@gmail.com wrote:

 Cassandra is running as root

 [root@storage5 ~]# ps aux | grep java
 root  1893 42.0 24.0 7630664 3904000 ? Sl   10:43  60:01 java
 -ea -javaagent:/mydb/cassandra/bin/../lib/jamm-0.2.5.jar
 -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
 -XX:ThreadPriorityPolicy=42 -Xms3959M -Xmx3959M -Xmn400M
 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=103
 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
 -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
 -XX:+UseTLAB -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true
 -Dcom.sun.management.jmxremote.port=7199
 -Dcom.sun.management.jmxremote.ssl=false
 -Dcom.sun.management.jmxremote.authenticate=false
 -Dlog4j.configuration=log4j-server.properties
 -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/cassandra.pid
 -cp
 /mydb/cassandra/bin/../conf:/mydb/cassandra/bin/../build/classes/main:/mydb/cassandra/bin/../build/classes/thrift:/mydb/cassandra/bin/../lib/antlr-3.2.jar:/mydb/cassandra/bin/../lib/apache-cassandra-2.0.7.jar:/mydb/cassandra/bin/../lib/apache-cassandra-clientutil-2.0.7.jar:/mydb/cassandra/bin/../lib/apache-cassandra-thrift-2.0.7.jar:/mydb/cassandra/bin/../lib/commons-cli-1.1.jar:/mydb/cassandra/bin/../lib/commons-codec-1.2.jar:/mydb/cassandra/bin/../lib/commons-lang3-3.1.jar:/mydb/cassandra/bin/../lib/compress-lzf-0.8.4.jar:/mydb/cassandra/bin/../lib/concurrentlinkedhashmap-lru-1.3.jar:/mydb/cassandra/bin/../lib/disruptor-3.0.1.jar:/mydb/cassandra/bin/../lib/guava-15.0.jar:/mydb/cassandra/bin/../lib/high-scale-lib-1.1.2.jar:/mydb/cassandra/bin/../lib/jackson-core-asl-1.9.2.jar:/mydb/cassandra/bin/../lib/jackson-mapper-asl-1.9.2.jar:/mydb/cassandra/bin/../lib/jamm-0.2.5.jar:/mydb/cassandra/bin/../lib/jbcrypt-0.3m.jar:/mydb/cassandra/bin/../lib/jline-1.0.jar:/mydb/cassandra/bin/../lib/json-simple-1.1.jar:/mydb/cassandra/bin/../lib/libthrift-0.9.1.jar:/mydb/cassandra/bin/../lib/log4j-1.2.16.jar:/mydb/cassandra/bin/../lib/lz4-1.2.0.jar:/mydb/cassandra/bin/../lib/metrics-core-2.2.0.jar:/mydb/cassandra/bin/../lib/netty-3.6.6.Final.jar:/mydb/cassandra/bin/../lib/reporter-config-2.1.0.jar:/mydb/cassandra/bin/../lib/servlet-api-2.5-20081211.jar:/mydb/cassandra/bin/../lib/slf4j-api-1.7.2.jar:/mydb/cassandra/bin/../lib/slf4j-log4j12-1.7.2.jar:/mydb/cassandra/bin/../lib/snakeyaml-1.11.jar:/mydb/cassandra/bin/../lib/snappy-java-1.0.5.jar:/mydb/cassandra/bin/../lib/snaptree-0.1.jar:/mydb/cassandra/bin/../lib/super-csv-2.1.0.jar:/mydb/cassandra/bin/../lib/thrift-server-0.3.3.jar
 org.apache.cassandra.service.CassandraDaemon




  On Mon, May 5, 2014 at 1:02 PM, Philip Persad 
  philip.per...@gmail.com wrote:

  Have you tried running ulimit -a as the Cassandra user instead of as
  root? It is possible that you configured a high file limit for root but
  not for the user running the Cassandra process.


  On Sun, May 4, 2014 at 6:07 PM, Yatong Zhang bluefl...@gmail.com wrote:

 [root@storage5 ~]# lsof -n | grep java | wc -l
 5103
 [root@storage5 ~]# lsof | wc -l
 6567


 It's mentioned in previous mail:)


 On Mon, May 5, 2014 at 9:03 AM, nash nas...@gmail.com wrote:

 The lsof command or /proc can tell you how many open files it has.
 How many is it?

 --nash