Re: host clocks

2011-07-24 Thread Edward Capriolo
You should always sync your host clocks. Clients provide the timestamps, but
on the server side gc_grace and TTL columns can have issues if the server
clocks are not correct.

On Sunday, July 24, 2011, 魏金仙  wrote:
> hi all,
>I'm launching a cassandra cluster with 30 nodes. I wonder whether the
inconsistency of host clocks will influence the performance of the cluster.
>   Thanks!
>
>
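A minimal sketch (illustrative only, not code from the thread) of why skewed clocks matter: Cassandra resolves conflicting versions of a column by the highest client-supplied timestamp, so a client or node whose clock runs fast can make a stale value win over a genuinely later write:

```python
import time

def resolve(versions):
    """Last-write-wins: the version with the highest timestamp survives."""
    return max(versions, key=lambda v: v[0])

# Timestamps are microseconds since the epoch.
now_us = int(time.time() * 1_000_000)

# A write stamped by a client whose clock runs 60 seconds fast...
fast_clock_write = (now_us + 60_000_000, "stale-value")
# ...beats a genuinely later write from a correctly synced client.
later_write = (now_us + 1_000_000, "fresh-value")

assert resolve([fast_clock_write, later_write])[1] == "stale-value"
```

The same timestamp comparison underlies tombstone expiry (gc_grace) and TTL handling, which is why server clocks need to be in sync too.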


Some commit logs not deleting

2011-07-24 Thread Chad Johnson
Hello,

We are running Cassandra 0.7.5 on a 15 node cluster, RF=3. We are having a
problem where some commit logs do not get deleted. Our write load generates
a new commit log about every two to three minutes. On average, one commit
log an hour is not deleted. Without draining, deleting the remaining commit
log files and restarting each node in the cluster, the commit log partition
will fill up. We do one thing with our cluster that is probably not very
common. We make schema changes four times per day. We cycle column families
by dropping old column families for old data we don't care about any longer
and creating new ones for new data.

Is anybody else seeing this problem? I assume that the dirty bit for those
commit logs is still set, but why? How can I determine which CF memtable is
still dirty?

Please let me know if there is additional information I can provide and
thanks for your help.

Chad


host clocks

2011-07-24 Thread 魏金仙
hi all,
   I'm launching a cassandra cluster with 30 nodes. I wonder whether the 
inconsistency of host clocks will influence the performance of the cluster.
  Thanks! 

Re: "select * from A join B using(common_id) where A.id == a and B.id == b "

2011-07-24 Thread aaron morton
> my fall-back approach is, since A and B do not change a lot, I'll
> pre-generate the join of A and B (not very large) keyed on A.id +
> B.id,
> then do the get(a+b)

+1 materialise views / joins you know you want ahead of time. Trade space for 
time. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23 Jul 2011, at 10:41, Yang wrote:

> this is a common pattern used in RDMS,
> is there some existing idiom to do it in cassandra ?
> 
> 
> if the size of "select * from A where id == a " is very large, and
> similarly for B, while the join of A.id == a and B.id==b is small,
> then doing a get() for both and then merging seems excessively slow.
> 
> 
> my fall-back approach is, since A and B do not change a lot, I'll
> pre-generate the join of A and B (not very large) keyed on A.id +
> B.id,
> then do the get(a+b)
> 
> 
> thanks
> Yang
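The pre-generated join Yang describes can be sketched as follows (an illustrative toy in plain Python; the table names `A`/`B` and the `:` key separator are assumptions, not from the thread). The idea is to trade write-time work and storage for a single O(1) key lookup at read time:

```python
# Rows of the two "tables", keyed by id.
A = {"a1": {"x": 1}, "a2": {"x": 2}}
B = {"b1": {"y": 10}, "b2": {"y": 20}}

# Pre-generate the materialised join, keyed on A.id + B.id.
joined = {}
for a_id, a_row in A.items():
    for b_id, b_row in B.items():
        joined[f"{a_id}:{b_id}"] = {**a_row, **b_row}

# Query time becomes a single key lookup instead of two large
# fetches followed by a client-side merge.
assert joined["a1:b2"] == {"x": 1, "y": 20}
```

In Cassandra terms the `joined` dict would be a separate column family, rebuilt or incrementally updated whenever A or B changes.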



Re: question on setup for writes into 2 datacenters

2011-07-24 Thread aaron morton
Quick reminder, with RF == 2 the QUORUM is 2 as well. So when using 
LOCAL_QUORUM with RF 2+2 you will effectively be using LOCAL_ALL which may not 
be what you want. As De La Soul sang, 3 is the magic number for minimum fault 
tolerance (QUORUM is then 2). 
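The quorum arithmetic behind this point is a strict majority of the replica count, which a one-liner makes concrete (illustrative sketch, not project code):

```python
def quorum(rf):
    """Quorum is a strict majority of the replicas: floor(rf/2) + 1."""
    return rf // 2 + 1

# With RF=2 per datacenter, LOCAL_QUORUM needs 2 of 2 local replicas --
# effectively LOCAL_ALL: a single local node down fails the write.
assert quorum(2) == 2
# With RF=3 per datacenter, LOCAL_QUORUM needs only 2 of 3,
# so one local node can be down and writes still succeed.
assert quorum(3) == 2
assert quorum(4) == 3
```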

Cheers
  
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23 Jul 2011, at 10:04, Sameer Farooqui wrote:

> It sounds like what you're looking for is write consistency of local_quorum:
> http://www.datastax.com/docs/0.8/consistency/index#write-consistency
> 
> local_quorum would mean the write has to be successful on a majority of nodes 
> in DC1 (so 2) before it is considered successful.
> 
> If you use just quorum write, it'll have to be committed to 3 replicas out of 
> the 4 before it's considered successful.
> 
> 
> 
> 
> On Fri, Jul 22, 2011 at 1:57 PM, Dean Hiller  wrote:
> Ideally, we would want to have a replication factor of 4, and a minimum write 
> consistency of 2 (which looking at the default in cassandra.yaml is to memory 
> first with asynch to disk...perfect so far!!!)
> 
> Now, obviously, I can get the partitioner setup to make sure I get 2 replicas 
> in each data center.  The next thing I would want to guarantee however is 
> that if a write came into datacenter 1, it would write to the two nodes in 
> datacenter 1 and asynchronously replicate to datacenter 2.  Is this possible? 
>  Does cassandra already handle that or is there something I could do to get 
> cassandra to do that?
> 
> In this mode, I believe I can have both datacenters be live as well as be 
> backup for the other not wasting resources.
> 
> thanks,
> Dean
> 



Re: Counter consistency - are counters idempotent?

2011-07-24 Thread aaron morton
What's your use case ? There are people out there having good times with 
counters, see

http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011
http://www.scribd.com/doc/59830692/Cassandra-at-Twitter

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23 Jul 2011, at 08:37, Aaron Turner wrote:

> On Fri, Jul 22, 2011 at 9:27 AM, Donal Zang  wrote:
>> On 22/07/2011 18:08, Yang wrote:
>>> 
>>> btw, this "issue" of  not knowing whether a write is persisted or not
>>> when client reports error, is not limited to counters,  for regular
>>> columns, it's the same: if client reports write failure, the value may
>>> well be replicated to all replicas later.  this is even the same with
>>> all other systems: Zookeeper, Paxos, ultimately due to the FLP
>>> theoretical result of "no guarantee of consensus in async systems"
>> 
>> yes, but with regular columns, retry is OK, while counter is not.
> 
> I know I've heard that fixing this issue is "hard".  I've assumed this
> to mean "don't expect a fix anytime soon".  Is that accurate?
> Beginning to start having second thoughts that Cassandra is the right
> fit for my project which would heavily rely on counters to roll up
> aggregates.
> 
> -- 
> Aaron Turner
> http://synfin.net/ Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & 
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
> -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
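Donal's point that "retry is OK, while counter is not" comes down to idempotency, which this toy simulation (illustrative only, not Cassandra internals) makes explicit:

```python
# Regular column write: idempotent -- a retry converges to the same value.
store = {}
def write_column(key, value):
    store[key] = value

write_column("k", "v")
write_column("k", "v")          # retry after a timeout: harmless
assert store["k"] == "v"

# Counter increment: NOT idempotent. If the first attempt actually
# succeeded but the client only saw a timeout, the retry double-counts.
counters = {}
def increment(key, delta):
    counters[key] = counters.get(key, 0) + delta

increment("hits", 1)            # succeeded, but the client saw a timeout
increment("hits", 1)            # retry
assert counters["hits"] == 2    # the intended value was 1
```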



Re: CL=N-1?

2011-07-24 Thread aaron morton
Sounds like you want RF == N more than CL == N-1. You could create a separate 
KS with RF == N and then just write at CL = QUORUM, the write will be sent to 
every UP replica. 

For background: the System Keyspace uses the o.a.c.locator.LocalStrategy, which 
*only* stores data on the node the mutation is sent to. So there is *no 
replication*; the system takes care of ensuring that data which needs to be 
replicated (i.e. schema defs) is replicated via higher level features. Using 
the LocalStrategy yourself is probably a bad idea. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23 Jul 2011, at 07:24, Yang wrote:

> is there such an option?
> 
> in some cases I want to distribute some small lookup tables to all the
> nodes, so that everyone has a local copy, and loaded in memory. so the
> lookup is fast. supposedly I want to write to all N nodes, but that
> exposes me to failure in case of just one node down.
> so I'd like to declare success to N-1 nodes.
> 
> thanks
> Yang



Re: Predictable low RW latency, SLABS and STW GC

2011-07-24 Thread aaron morton
Restarting the service will drop all the memory-mapped caches; Cassandra caches are 
saved / persistent, and you can also use memcached if you want. 

Are you experiencing stop the world pauses? There are some things that can be 
done to reduce the chance of them happening. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23 Jul 2011, at 05:34, Milind Parikh wrote:

> In order to be predictable @ big data scale, the intensity and periodicity of 
> STW Garbage Collection has to be brought down. Assume that SLABS (CASSANDRA-2252) 
> will be available in the main line at some time, and assume that this will 
> have the impact that other projects (hbase etc) are reporting. I wonder 
> whether avoiding GC by restarting the servers before GC will be a feasible 
> approach (of course while knowing the workload).
> 
> Regards
> Milind
> 



Re: do I need to add more nodes? minor compaction eat all IO

2011-07-24 Thread Jonathan Ellis
It's sequential per-sstable.  If you are compacting a lot of sstables
how closely this approximates "completely sequential" will
deteriorate.

On Sun, Jul 24, 2011 at 1:18 PM, Francois Richard  wrote:
> Jonathan,
>
> Are you sure that the reads done for compaction are sequential with Cassandra 
> 0.6.13?  This is not what I am observing right now.  During a minor 
> compaction I usually observe ~ 1500 to 1900 r/s while rMB/s is barely around 
> 30 to 35MB/s.
>
> Just asking out of curiosity.
>
>
> FR
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Saturday, July 23, 2011 5:05 PM
> To: user@cassandra.apache.org
> Subject: Re: do I need to add more nodes? minor compaction eat all IO
>
> On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard  wrote:
>> My understanding is that during compaction cassandra does a lot of 
>> non-sequential reads and then dumps the results with a big sequential write.
>
> Compaction reads and writes are both sequential, and 0.8 allows setting a 
> MB/s to cap compaction at.
>
> As to the original question "do I need to add more machines" I'd say that 
> depends more on whether your application's SLA is met, than what % io util 
> spikes to.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support 
> http://www.datastax.com
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Little problems with Solandra

2011-07-24 Thread Jean-Nicolas Boulay Desjardins
Do I need to install Tomcat? Maybe that is the problem...

On Sat, Jul 23, 2011 at 12:03 PM, Jean-Nicolas Boulay Desjardins
 wrote:
> How is it called in the "ps"?
>
> Because I have a strong feeling that it tries to load... But for some
> reason that is beyond me it does not.
>
> Is there a log file, or something?
>
> Thanks again.
>
> On Sat, Jul 23, 2011 at 12:02 PM, Jean-Nicolas Boulay Desjardins
>  wrote:
>> No, I did start it.
>>
>> This is what I got when I started it:
>>
>> INFO 16:00:38,053 Logging initialized
>>  INFO 16:00:38,119 Sucessfully Hijacked FieldCacheImpl
>>  INFO 16:00:38,151 Logging initialized
>>  INFO 16:00:38,175 Heap size: 124780544/124780544
>>
>> And then it stop... So I typed enter on the keyboard... And I was back
>> in the cmd...
>>
>> I don't know if that's normal?
>>
>> Thanks again.
>>
>> On Sat, Jul 23, 2011 at 11:33 AM, Jake Luciani  wrote:
>>> Sounds like you forgot to start solandra after you built it.
>>>
>>> cd solandra-app; ./bin/solandra
>>>
>>> You can verify it's running with jps look for SolandraServer.
>>>
>>>
>>>
>>> On Jul 23, 2011, at 10:52 AM, Jean-Nicolas Boulay Desjardins 
>>>  wrote:
>>>
 Hi,

 I have a server on RackSpace and it seems that when I use "ant" it
 makes Apache2 crash. I don't know if this is normal?

 Maybe it's because I have 256MB for RAM. Could it be?

 Should I get more RAM?

 Also, when I use the command "ps -A" I don't seem to be able to
 identify which is Solandra... How can I know if Solandra is running...
 Because I have this gut feeling that it's not running, maybe because
 of the lack of RAM...

 That's not all. When I do "# sh ./2-import-data.sh" I get this nasty
 little error:

 curl: (7) couldn't connect to host
 Posted schema.xml to http://localhost:8983/solandra/schema/reuters
 Loading data to solandra, note: this importer uses a slow xml parser
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: I/O exception (java.net.ConnectException) caught when processing
 request: Connection refused
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: Retrying request
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: I/O exception (java.net.ConnectException) caught when processing
 request: Connection refused
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: Retrying request
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: I/O exception (java.net.ConnectException) caught when processing
 request: Connection refused
 Jul 23, 2011 2:48:17 PM
 org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
 INFO: Retrying request
 Exception in thread "main" java.lang.RuntimeException: unable to
 connect to solr server: http://localhost:8983/solandra/reuters
    at 
 org.apache.solr.solrjs.sgml.reuters.ReutersService.(ReutersService.java:93)
    at 
 org.apache.solr.solrjs.sgml.reuters.ReutersService.main(ReutersService.java:63)
 Caused by: org.apache.solr.client.solrj.SolrServerException:
 java.net.ConnectException: Connection refused
    at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:391)
    at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
    at 
 org.apache.solr.client.solrj.request.SolrPing.process(SolrPing.java:60)
    at org.apache.solr.client.solrj.SolrServer.ping(SolrServer.java:105)
    at 
 org.apache.solr.solrjs.sgml.reuters.ReutersService.(ReutersService.java:91)
    ... 1 more
 Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at 
 java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
    at 
 java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
    at 
 java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
    at java.net.Socket.connect(Socket.java:546)
    at java.net.Socket.connect(Socket.java:495)
    at java.net.Socket.(Socket.java:392)
    at java.net.Socket.(Socket.java:266)
    at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
    at 
 org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
    at 
 org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at 
 org.apache.commons.httpclient.MultiThreadedHt

RE: do I need to add more nodes? minor compaction eat all IO

2011-07-24 Thread Francois Richard
Jonathan,

Are you sure that the reads done for compaction are sequential with Cassandra 
0.6.13?  This is not what I am observing right now.  During a minor compaction 
I usually observe ~ 1500 to 1900 r/s while rMB/s is barely around 30 to 35MB/s.

Just asking out of curiosity.


FR

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Saturday, July 23, 2011 5:05 PM
To: user@cassandra.apache.org
Subject: Re: do I need to add more nodes? minor compaction eat all IO

On Sat, Jul 23, 2011 at 4:16 PM, Francois Richard  wrote:
> My understanding is that during compaction cassandra does a lot of 
> non-sequential reads and then dumps the results with a big sequential write.

Compaction reads and writes are both sequential, and 0.8 allows setting a MB/s 
to cap compaction at.

As to the original question "do I need to add more machines" I'd say that 
depends more on whether your application's SLA is met, than what % io util 
spikes to.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support 
http://www.datastax.com


Re: Cassandra 0.7.8 and 0.8.1 fail when major compaction on 37GB database

2011-07-24 Thread Mina Naguib

From experience with similar-sized data sets, 1.5GB may be too little.  
Recently I bumped our java HEAP limit from 3GB to 4GB to get past an OOM doing 
a major compaction.

Check "nodetool -h localhost info" while the compaction is running for a simple 
view into the memory state.

If you can, also hook in jconsole and you'll get a better view, over time, of 
how cassandra's memory usage trends, the effect of GC, and the pressure of 
various operations such as compactions.


On 2011-07-24, at 8:08 AM, lebron james wrote:

>   Hi, please help me with my problem. For better performance I turned off 
> compaction and ran massive inserts; after the database reached 37GB I stopped the 
> inserts and started compaction with "nodetool compact Keyspace CFamily". 
> After half an hour of work Cassandra fell over with an "Out of memory" error. I give 
> 1500M to the JVM, and all parameters in the yaml file are default. Test OSes are 
> Ubuntu 11.04 and Windows Server 2008 DC edition. Thanks!  



Re: Cassandra 0.7.8 and 0.8.1 fail when major compaction on 37GB database

2011-07-24 Thread Edward Capriolo
On Sunday, July 24, 2011, lebron james  wrote:
>   Hi, please help me with my problem. For better performance I turned off
compaction and ran massive inserts; after the database reached 37GB I stopped the
inserts and started compaction with "nodetool compact Keyspace CFamily".
After half an hour of work Cassandra fell over with an "Out of memory" error. I give
1500M to the JVM, and all parameters in the yaml file are default. Test OSes are
Ubuntu 11.04 and Windows Server 2008 DC edition. Thanks!
>

Lebron, good to see you have taken your talents from South Beach to big data.

Though 1500MB is a lot of memory, it is not a lot for Cassandra. In
particular, a burst of requests causes more memory allocation. Compaction is
intensive on the disk and memory system.

There are many things you can do to lower caches, optimize memtables, and
tune the JVM. If you can spare more JVM memory that might help; otherwise you
may have hit a compaction bug.


Re: After column deletion cassandra won't insert more data to a specific key

2011-07-24 Thread Edward Capriolo
Remember the CLI uses microsecond precision, so if your app is not using
the same precision, weird things will result, with the client writing the
biggest timestamp winning the final value.
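A quick sketch (illustrative, not from the thread) of the precision mismatch: a client stamping in milliseconds produces values roughly 1000x smaller than the CLI's microsecond stamps, so every one of its writes, and deletes, compares as "older" and loses:

```python
import time

# The CLI timestamps in microseconds since the epoch; a client library
# that stamps in milliseconds produces numbers ~1000x smaller.
micros = int(time.time() * 1_000_000)   # CLI-style timestamp
millis = int(time.time() * 1000)        # ms-stamping client

# The millisecond stamp always compares as far in the past, so the
# CLI's value (or tombstone) shadows the client's writes indefinitely.
assert micros > millis * 100
```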

On Saturday, July 23, 2011, Jonathan Ellis  wrote:
> You must have given it a delete timestamp in the "future."
>
> On Sat, Jul 23, 2011 at 3:46 PM, Guillermo Winkler
>  wrote:
>> I'm having a strange behavior on one of my cassandra boxes, after all
>> columns are removed from a row, insertion on that key stops working (from
>> API and from the cli)
>> [default@Agent] get Schedulers['atendimento'];
>> Returned 0 results.
>> [default@Agent] set Schedulers['atendimento']['test'] = 'dd';
>> Value inserted.
>> [default@Agent] get Schedulers['atendimento'];
>> Returned 0 results.
>> Already tried nodetool flush/compact/repair on the CF, doesn't fix the
>> problem.
>> With a very simple setup:
>> * only one node in the cluster (the cluster never had more nodes nor
>> replicas)
>> * random partitioner
>> * CF defined as "create column family Schedulers with
comparator=BytesType;"
>> The only way for it to start working again is to truncate the CF.
>> Do you have any clues how to diagnose what's going on?
>> Thanks,
>> Guille
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Cassandra 0.7.8 and 0.8.1 fail when major compaction on 37GB database

2011-07-24 Thread lebron james
  Hi, please help me with my problem. For better performance I turned off
compaction and ran massive inserts; after the database reached 37GB I stopped the
inserts and started compaction with "nodetool compact Keyspace CFamily".
After half an hour of work Cassandra fell over with an "Out of memory" error. I give
1500M to the JVM, and all parameters in the yaml file are default. Test OSes are
Ubuntu 11.04 and Windows Server 2008 DC edition. Thanks!


Re: CompositeType for row Keys

2011-07-24 Thread David Boxenhorn
Why do you need another CF? Is there something wrong with repeating the key
as a column and indexing it?

On Fri, Jul 22, 2011 at 7:40 PM, Patrick Julien  wrote:

> Exactly.  In any case, I just answered my own question.  If I need
> range, I can just make another column family where the column name are
> these keys
>
> On Fri, Jul 22, 2011 at 12:37 PM, Nate McCall  wrote:
> >> yes,but why would you use CompositeType if you don't need range query?
> >
> > If you were doing composite keys anyway (common approach with time
> > series data for example), you would not have to write parsing and
> > concatenation code. Particularly useful if you had mixed types in the
> > key.
> >
>