Re: Too many open files

2018-01-22 Thread Nikolay Mihaylov
You can increase the system open-files limit; also, if you compact, the number of open files will go down. On Mon, Jan 22, 2018 at 10:19 AM, Dor Laor wrote: > It's a high number, your compaction may run behind and thus > many small sstables exist. However, you're also taking the > number of network connection in the cal

Re: Production with Single Node

2016-01-27 Thread Nikolay Mihaylov
Hi, we have 2-3 installations with single-node Cassandra. They are working fine, no problems there, except that if Cassandra stops, everything stops. Even on one node, we are usually "rolling" 500-600 GB of data, sometimes even 2-3 TB. We use a mostly standard configuration with almost no changes. Here are

sstable structure

2015-01-02 Thread Nikolay Mihaylov
Hi, for some time I have been trying to find the structure of an sstable. Is it documented somewhere, or can anyone explain it to me? I am speaking about the "hex dump" bytes stored on the disk. Nick.

Re: Tombstones without DELETE

2015-01-02 Thread Nikolay Mihaylov
Hi Tyler, sorry for the very stupid question - what is a collection? Nick On Wed, Dec 31, 2014 at 6:27 PM, Tyler Hobbs wrote: > Overwriting an entire collection also results in a tombstone being > inserted. > > On Wed, Dec 24, 2014 at 7:09 AM, Ryan Svihla wrote: > >> You should probably ask on t

Re: disk space issue

2014-09-30 Thread Nikolay Mihaylov
my 2 cents: try a major compaction on the column family with TTLs - it will surely be faster than a full rebuild. also try things not related to cassandra, such as checking for and removing old log files, backups etc. On Wed, Oct 1, 2014 at 9:34 AM, Sumod Pawgi wrote: > In the past in such scenarios it has helpe

Re: Cassandra 2.0.7 always failes due to 'too may open files' error

2014-05-15 Thread Nikolay Mihaylov
sorry, probably somebody mentioned it already, but did you check the global limit? cat /proc/sys/fs/file-max; cat /proc/sys/fs/file-nr On Mon, May 5, 2014 at 10:31 PM, Bryan Talbot wrote: > Running > > #> cat /proc/$(cat /var/run/cassandra.pid)/limits > > as root or your cassandra user will tell you what
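
The global check mentioned above can also be scripted. This is only a minimal sketch, assuming a Linux host where /proc/sys/fs/file-nr holds three whitespace-separated numbers (allocated handles, allocated-but-unused handles, system-wide maximum); the function names are made up for illustration and are not part of Cassandra or any tool in the thread.

```python
from pathlib import Path


def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr.

    Returns (allocated, unused, maximum) as integers.
    """
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def handle_usage(text):
    """Fraction of the system-wide file handle limit currently in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


if __name__ == "__main__":
    # Hypothetical usage: only meaningful on a Linux machine.
    file_nr = Path("/proc/sys/fs/file-nr")
    if file_nr.exists():
        content = file_nr.read_text()
        print("allocated, unused, max:", parse_file_nr(content))
        print("usage: {:.1%}".format(handle_usage(content)))
```

If the usage is far below the maximum, the global limit is probably not the problem and the per-process limit quoted below is the more likely one to look at.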

Re: Avoiding email duplicates when registering users

2014-05-13 Thread Nikolay Mihaylov
the real question is - if you want the email to be unique, why use a "surrogate" primary key such as a UUID? I wonder what the UUID gives you at all. If you want a non-email primary key, why not use md5(email)? On Wed, May 7, 2014 at 2:19 AM, Tyler Hobbs wrote: > > On Mon, May 5, 2014 at 10:27 AM
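
A minimal sketch of the md5(email) idea, assuming the key is derived on the client before the insert. The function name email_key and the normalization step (trimming whitespace and lowercasing) are assumptions added for illustration; the thread does not say how the email should be normalized.

```python
import hashlib


def email_key(email: str) -> str:
    """Derive a deterministic row key from an email address.

    Assumption: addresses are compared case-insensitively and without
    surrounding whitespace, so they are normalized before hashing.
    """
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same address always maps to the same 32-character hex key,
    # so a second registration with that email targets the same row.
    print(email_key("Nick@Example.com"))
    print(email_key("  nick@example.com "))
```

Because the key depends only on the email, two registrations with the same address collide on one row instead of creating two rows under different UUIDs. Whether the second write is rejected or silently overwrites the first still depends on how the insert is done.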

Re: DELETE does not delete :)

2013-10-06 Thread Nikolay Mihaylov
Hi, my two cents - before doing anything else, make sure the clocks are synchronized to the millisecond. ntp will do that. Nick. On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev wrote: > Hi all, > > We have encountered the following problem with cassandra. > > * We use *cassandra v2.0.0* from *Data

cassandra hi bandwith

2013-09-10 Thread Nikolay Mihaylov
Hi, we have Cassandra 1.2.6, single node. We have a website, running on a different server. Recently we noticed that we have 40 MBit of traffic from the Cassandra server to the web server. We use phpcassa. In OpsCenter we have a "KeyCache Hits" value of around 2000. I found the most used CF's from n

Re: Random Distribution, yet Order Preserving Partitioner

2013-08-23 Thread Nikolay Mihaylov
mi-ordered partitioner - instead of single MD5, > to have two MD5's. > > Sounds interesting. But, we need a fully ordered result. > > Anyway, I will try with the latest version. > > Thanks, > Takenori > > > On Thu, Aug 22, 2013 at 6:12 PM, Nikolay Mihaylov wrote:

Re: Random Distribution, yet Order Preserving Partitioner

2013-08-22 Thread Nikolay Mihaylov
my five cents - token and key are not the same. it was like this a long time ago (single MD5 assumed single key). if you want ordered, you can probably arrange your data in a way so you can get it in ordered fashion. for example, long ago I had a single column family with a single key and about 2-3 M columns

Re: cassandra disk access

2013-08-07 Thread Nikolay Mihaylov
thanks It will use the Index Sample (RAM) first, then it will use "full" Index (disk) and finally it will read data from SSTable (disk). There's no such thing like "collision" in this case. so it still has 2 seeks :) where can I see the internal structure of the sstable? i tried to find it docum
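
The three-step lookup described above (index sample in RAM, then the "full" index, then the data) can be sketched as a toy model. This is an illustration only, not Cassandra's on-disk format: the class ToySSTable, the sample_every parameter and the seek counter are all hypothetical, and rows are ordered by key here for simplicity.

```python
import bisect


class ToySSTable:
    """Toy model of a sampled index in front of a sorted data file."""

    def __init__(self, rows, sample_every=4):
        # "Data file": rows sorted by key.
        self.data = sorted(rows.items())
        # "Full" index (pretend it is on disk): key -> position in data.
        self.index = [(key, pos) for pos, (key, _) in enumerate(self.data)]
        # Index sample (kept in RAM): every sample_every-th index entry.
        self.sample_keys = [key for key, _ in self.index[::sample_every]]
        self.sample_every = sample_every
        self.seeks = 0  # counts simulated disk seeks

    def get(self, key):
        # Step 1 (RAM): find the last sampled key <= the wanted key.
        i = bisect.bisect_right(self.sample_keys, key) - 1
        if i < 0:
            return None  # key sorts before everything in this table
        start = i * self.sample_every
        # Step 2 (simulated seek 1): scan one block of the full index.
        self.seeks += 1
        for k, pos in self.index[start:start + self.sample_every]:
            if k == key:
                # Step 3 (simulated seek 2): read the row from the data.
                self.seeks += 1
                return self.data[pos][1]
        return None


if __name__ == "__main__":
    table = ToySSTable({"k%02d" % n: n for n in range(10)})
    print(table.get("k07"), "seeks:", table.seeks)
```

In this model a successful lookup costs two simulated seeks, which matches the "2 seeks" remark; caches and other optimizations in a real node are left out.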

cassandra disk access

2013-08-07 Thread Nikolay Mihaylov
Hi, I am researching various hash tables and B-trees on disk. While researching, I had some thoughts about Cassandra sstables that I want to verify here. 1. A Cassandra sstable uses sequential disk I/O when created, i.e. the disk head writes it from the beginning to the end. Assuming the disk is not fr

Re: unable to delete

2013-06-07 Thread Nikolay Mihaylov
Hi, please note that when you drop a column family, the data on the disk is not deleted. This is something you should do yourself. >> Do the files get deleted on GC/server restart? The question actually translates to: does the column family exist after the restart? John pls correct me if I am explai

Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Nikolay Mihaylov
Did you synchronize the clocks between servers? On Thu, May 23, 2013 at 9:32 AM, Tamar Fraenkel wrote: > Hi! > I have Cassandra cluster with 3 node running version 1.0.11. > > I am using Hector HLockManagerImpl, which creates a keyspace named > HLockManagerImpl and CF HLocks. > For some reason

Re: nodetool ring generate strange info

2013-05-10 Thread Nikolay Mihaylov
do you use vnodes ? On Fri, May 10, 2013 at 10:19 AM, 杨辉强 wrote: > Hi, all > I use ./bin/nodetool -h 10.21.229.32 ring > > It generates lots of info of same host like this: > > 10.21.229.32 rack1 Up Normal 928.3 MB24.80% >8875305964978355793 > 10.21.229.32 rack1

Re: How to find total number of rows in Cassandra databaase?

2013-04-22 Thread Nikolay Mihaylov
Hi, it is very important to know that counting rows is very, very expensive. Here is my 5 cents - in one of my early projects we made a separate column family with just a single row. We inserted each row key from the other CF into this row as a column key. Then once a day or so, we did get_count(). ho

Re: differences between DataStax Community Edition and Cassandra package

2013-04-19 Thread Nikolay Mihaylov
Are there alternative file systems running on top of Cassandra? On Fri, Apr 19, 2013 at 7:44 PM, Robert Coli wrote: > On Fri, Apr 19, 2013 at 4:18 AM, Goktug YILDIRIM > wrote: > > > > I am sorry if this a very well know topic and I missed that. I wonder > why one must use CFS. What is unavaila

Re: differences between DataStax Community Edition and Cassandra package

2013-04-18 Thread Nikolay Mihaylov
FS.pdf<http://www.datastax.com/wp-content/uploads/2012/09/WP-DataStax-HDFSvsCFS.pdf>:-) > > W dniu 18.04.2013 14:10, Nikolay Mihaylov pisze: > > whats CDFS ? I am sure you are not referring iso9660, e.g. CD-ROM >> filesystem? :) >> >> >> On Wed, Apr 17, 20

Re: differences between DataStax Community Edition and Cassandra package

2013-04-18 Thread Nikolay Mihaylov
what's CDFS? I am sure you are not referring to iso9660, i.e. the CD-ROM filesystem? :) On Wed, Apr 17, 2013 at 10:42 PM, Robert Coli wrote: > On Wed, Apr 17, 2013 at 11:19 AM, aaron morton wrote: > >> It's the same as the Apache version, but DSC comes with samples and the >> free version of Ops Centr

Re: running cassandra on 8 GB servers

2013-04-15 Thread Nikolay Mihaylov
dra-env.sh or > cassandra.yaml revert them. Generally C* will do the right thing and not > OOM, unless you are trying to store a lot of data on a node that does not > have enough memory. See this thread for background > http://www.mail-archive.com/user@cassandra.apache.org/msg25762.htm

Re: running cassandra on 8 GB servers

2013-04-11 Thread Nikolay Mihaylov
at does not > have enough memory. See this thread for background > http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickl

running cassandra on 8 GB servers

2013-04-11 Thread Nikolay Mihaylov
For one project I will need to run Cassandra on the following dedicated servers: single-CPU Xeon, 4 cores, no hyper-threading, 8 GB RAM, 12 TB of locally attached HDDs in some kind of RAID, visible as a single HDD. I can do a cluster of 20-30 such servers, maybe even more. The data will be huge, I am estim