integrating cassandra and hadoop

2014-11-26 Thread Tim Dunphy
Hey all, I'd like to connect my cassandra 2.1.2 cluster to hadoop to have it process the data. Are there any good tutorials you can recommend on how to accomplish this? I'm running CentOS 6.5 on my cassandra server and the hadoop name node is CentOS 7. Thanks Tim -- GPG me!! gpg --keyserver p

Re: Never running repair: No need vs consequences in our usage pattern

2014-11-26 Thread Robert Coli
On Wed, Nov 26, 2014 at 12:16 PM, Wayne Schroeder < wschroe...@pinsightmedia.com> wrote: > I have a 30+ node cluster that is under heavy read and write load. Based > on the fact that we never delete data, and all data is inserted with TTLs > and is somewhat temporal if not upserted, and we are fi

Never running repair: No need vs consequences in our usage pattern

2014-11-26 Thread Wayne Schroeder
I have a 30+ node cluster that is under heavy read and write load. Based on the fact that we never delete data, and all data is inserted with TTLs and is somewhat temporal if not upserted, and we are fine with the consistency of one and read repair chance, we elected to never repair. The reaso

Re: Use of line number and file name in default cassandra logging configuration

2014-11-26 Thread Robert Coli
On Wed, Nov 26, 2014 at 11:57 AM, Matt Brown wrote: > I created https://issues.apache.org/jira/browse/CASSANDRA-8379 and > attached patches against trunk and the cassandra-2.0 branch. > Sweet. Thanks for closing the loop and letting the list know the JIRA info. =Rob

Re: Use of line number and file name in default cassandra logging configuration

2014-11-26 Thread Matt Brown
I created https://issues.apache.org/jira/browse/CASSANDRA-8379 and attached patches against trunk and the cassandra-2.0 branch. > On Nov 26, 2014, at 2:05 PM, Robert Coli wrote: > > On Wed, Nov 26, 2014 at 10:39 AM, Matt Brown

Re: Repair completes successfully but data is still inconsistent

2014-11-26 Thread Robert Coli
On Wed, Nov 26, 2014 at 10:17 AM, André Cruz wrote: > Of these, the row in question was present on: > Disco-NamespaceFile2-ic-5337-Data.db - tombstone column > Disco-NamespaceFile2-ic-5719-Data.db - no trace of that column > Disco-NamespaceFile2-ic-5748-Data.db - live column with original timesta

Re: Use of line number and file name in default cassandra logging configuration

2014-11-26 Thread Robert Coli
On Wed, Nov 26, 2014 at 10:39 AM, Matt Brown wrote: > Both the log4j > and logback documentation warn > that generating the filename/line information is not a cheap ope

Use of line number and file name in default cassandra logging configuration

2014-11-26 Thread Matt Brown
In the logging configuration that ships with the cassandra distribution (log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file appender is configured to print the file name and the line number of each logging event: log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{I

Re: Repair completes successfully but data is still inconsistent

2014-11-26 Thread André Cruz
On 24 Nov 2014, at 18:54, Robert Coli wrote: > > But for any given value on any given node, you can verify the value it has in > 100% of SStables... that's what both the normal read path and repair should > do when reconciling row fragments into the materialized row? Hard to > understand a cas

Re: High cpu usage & segfaulting

2014-11-26 Thread Stan Lemon
Thanks everyone for the feedback. So some additional details... 1. Definitely using Oracle JDK (1.7.0_71-b14) 2. Yes, the segfaulting does go away after a restart 3. No OOM log messages when this occurs 4. We are seeing many GC pauses that take a long time, as in over 2 seconds - we are aware that

Re: High cpu usage & segfaulting

2014-11-26 Thread Tyler Hobbs
When I see a segfault, my first reaction is to always suspect OpenJDK. Are you using OpenJDK or the Oracle JDK? If you're using the former, I recommend the latter. On Tue, Nov 25, 2014 at 10:40 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi Stan, > > Put some monitoring on this.

multiple threads updating result in TransportException

2014-11-26 Thread Brian Tarbox
We're running into a problem where things are fine if our client runs single threaded but gets TransportException if we use multiple threads. The datastax driver gets an NIO checkBounds error. Here is a link to a stack overflow question we found that describes the problem we're seeing. This quest