Re: cassandra.yaml not picked up?

2010-05-18 Thread Nathan McCall
I came across this the other day as well. The following FAQ will get you going:
http://wiki.apache.org/cassandra/FAQ#no_keyspaces

-Nate
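
For anyone who finds this later: the gist of that FAQ entry is that trunk no
longer loads keyspace definitions from the configuration file at startup; the
schema in cassandra.yaml has to be imported once, e.g. by invoking
loadSchemaFromYAML over JMX on a single node. A minimal sketch of doing that
programmatically (the MBean name and the old default JMX port 8080 are
assumptions from the 0.6-era layout; verify against your build, or just use
jconsole):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class LoadSchema {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
                ObjectName ss = new ObjectName(
                    "org.apache.cassandra.service:type=StorageService");
                // one-time schema import; run against a single node only
                mbs.invoke(ss, "loadSchemaFromYAML",
                           new Object[0], new String[0]);
            } finally {
                jmxc.close();
            }
        }
    }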

On Tue, May 18, 2010 at 9:24 AM, Frank Du  wrote:
> Hey All,
>
> I tried to run from the cassandra trunk source. The keyspace schema has moved
> to cassandra.yaml. However, the Keyspace1 defined there doesn't seem to be
> picked up by cassandra. Is there any additional work needed to make cassandra
> use it? Thank you so much!
>
> Best Regards,
> Frank
>


Re: paging row keys

2010-05-12 Thread Nathan McCall
Oh, thanks to Andrey Panov for providing that example, btw. We are
always looking for good usage examples to post on the Hector wiki, if
anyone else has them.
-Nate

On Wed, May 12, 2010 at 5:01 PM, Nathan McCall  wrote:
> Here is a basic example using get_range_slices to retrieve 500 rows via 
> hector:
> http://github.com/bosyak/hector/blob/master/src/main/java/me/prettyprint/cassandra/examples/ExampleReadAllKeys.java
>
> To page, use the last key you got back as the start key.
>
> -Nate
>
> On Wed, May 12, 2010 at 3:37 PM, Corey Hulen  wrote:
>>
>> Can someone point me to a thrift sample (preferably java) to list all the
>> rows in a ColumnFamily for my Cassandra server?  I noticed some examples
>> using SlicePredicate and SliceRange to perform a similar query against the
>> columns with paging, but I was looking for something similar for rows with
>> paging.  I'm also new to Cassandra, so I might have misunderstood the
>> existing samples.
>> -Corey
>>
>>
>>
>
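
For the raw-Thrift version Corey asked about, here is a sketch against the
0.6 API (Keyspace1/Standard1 are placeholder names). Note the overlap: each
page's start key repeats the last key of the previous page, so pages after
the first skip one element:

    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.KeyRange;
    import org.apache.cassandra.thrift.KeySlice;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class ListAllKeys {
        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();

            // fetch at most one column per row; we only care about the keys
            SlicePredicate predicate = new SlicePredicate();
            predicate.setSlice_range(
                new SliceRange(new byte[0], new byte[0], false, 1));
            ColumnParent parent = new ColumnParent("Standard1");

            String start = "";
            boolean first = true;
            while (true) {
                KeyRange range = new KeyRange();
                range.setStart_key(start);
                range.setEnd_key("");
                range.setCount(500);
                List<KeySlice> page = client.get_range_slices(
                    "Keyspace1", parent, predicate, range,
                    ConsistencyLevel.ONE);
                for (int i = (first ? 0 : 1); i < page.size(); i++) {
                    System.out.println(page.get(i).getKey());
                }
                if (page.size() < 500) break;  // short page == no more rows
                start = page.get(page.size() - 1).getKey();
                first = false;
            }
            socket.close();
        }
    }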


Re: Tuning Cassandra

2010-05-10 Thread Nathan McCall
David,
Are you using batchMutate or insert? Jump on over to hector-users if
you want API help with either of these.

-Nate
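
For context, the difference in a nutshell, sketched against the 0.6-era
Hector API (class and method names are from that vintage and should be
treated as assumptions): insert costs one network round trip per column,
while batchMutate folds many columns, across rows and column families, into
a single call:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import me.prettyprint.cassandra.service.CassandraClient;
    import me.prettyprint.cassandra.service.CassandraClientPool;
    import me.prettyprint.cassandra.service.CassandraClientPoolFactory;
    import me.prettyprint.cassandra.service.Keyspace;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.Mutation;

    public class BatchMutateExample {
        public static void main(String[] args) throws Exception {
            CassandraClientPool pool = CassandraClientPoolFactory.INSTANCE.get();
            CassandraClient client = pool.borrowClient("localhost", 9160);
            try {
                Keyspace ks = client.getKeyspace("Keyspace1");
                long ts = System.currentTimeMillis();

                // row key -> column family -> list of mutations
                List<Mutation> mutations = new ArrayList<Mutation>();
                for (int i = 0; i < 100; i++) {
                    Column col = new Column(("col" + i).getBytes(),
                                            ("val" + i).getBytes(), ts);
                    ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
                    cosc.setColumn(col);
                    Mutation m = new Mutation();
                    m.setColumn_or_supercolumn(cosc);
                    mutations.add(m);
                }
                Map<String, Map<String, List<Mutation>>> batch =
                    new HashMap<String, Map<String, List<Mutation>>>();
                batch.put("rowKey",
                          Collections.singletonMap("Standard1", mutations));
                ks.batchMutate(batch);  // 100 columns, one round trip
            } finally {
                pool.releaseClient(client);
            }
        }
    }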


Re: replacing columns via remove and insert

2010-05-06 Thread Nathan McCall
This was on hector-users this morning as well. It has been addressed
and is now in trunk.

-Nate

On Thu, May 6, 2010 at 2:35 PM, Jonathan Ellis  wrote:
> That's kind of an odd API wart for Hector.  You should file an issue
> on http://github.com/rantav/hector
>
> On Thu, May 6, 2010 at 11:36 AM, Jonathan Shook  wrote:
>> I found the issue. Timestamp ordering was broken because:
>> I generated a timestamp for the group of operations. Then, I used
>> hector's remove, which generates its own internal timestamp.
>> I then re-used the timestamp, not wary of the missing timestamp field
>> on the remove operation.
>>
>> The fix was to simply regenerate my timestamp after any hector
>> operation which generates its own.
>>
>> In my case, hector generates its own internal timestamp for removes,
>> but not other operations. Until the timestamp resolution is better
>> than milliseconds, it's very possible to end up with the same
>> timestamp for tightly grouped operations, which may lead to unexpected
>> behavior. I've submitted a request to simplify this.
>>
>> On Wed, May 5, 2010 at 5:03 PM, Jonathan Shook  wrote:
>>> When I try to replace a set of columns, like this:
>>>
>>> 1) remove all columns under a CF/row
>>> 2) batch insert columns into the same CF/row
>>> .. the columns cease to exist.
>>>
>>> Is this expected?
>>>
>>> This is just across 2 nodes with Replication Factor 2 and Consistency
>>> Level QUOROM.
>>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
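
To make the tie-breaking pitfall above concrete, here is a sketch against
the raw 0.6 Thrift API (placeholder keyspace/CF names). A tombstone shadows
any column whose timestamp is not strictly greater than its own, so a
remove-then-reinsert must advance the timestamp:

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnPath;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class RemoveThenReinsert {
        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();

            ColumnPath path = new ColumnPath("Standard1");
            path.setColumn("name".getBytes());

            long ts = System.currentTimeMillis();
            client.remove("Keyspace1", "row1", path, ts,
                          ConsistencyLevel.QUORUM);

            // WRONG: inserting with the same ts ties with the tombstone,
            // and the column "ceases to exist" as described above.

            // RIGHT: use a strictly newer timestamp than the delete.
            client.insert("Keyspace1", "row1", path, "value".getBytes(),
                          ts + 1, ConsistencyLevel.QUORUM);
            socket.close();
        }
    }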


Re: cassandra jvm crash in GCTaskThread

2010-05-05 Thread Nathan McCall
Ran,
You may want to upgrade your jvm. I had a similar core with a heavily
loaded tomcat recently and found the following on the release notes
page for 1.6.0_18:
"Card-Marking Optimization Issue

A flaw in the implementation of a card-marking performance
optimization in the JVM can cause heap corruption under some
circumstances. This issue affects the CMS garbage collector prior to
6u18, and the CMS, G1 and Parallel Garbage Collectors in 6u18. The
serial garbage collector is not affected. Applications most likely to
be affected by this issue are those that allocate very large objects
which would not normally fit in Eden, or those that make extensive use
of JNI Critical Sections (JNI Get/Release*Critical)."

http://java.sun.com/javase/6/webnotes/6u18.html

Note that there is an apparent work-around jvm option to add:
-XX:-ReduceInitialCardMarks

-Nate
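
If you try that workaround, it goes with the rest of the JVM options;
assuming the stock 0.6-era bin/cassandra.in.sh layout, that means appending
to JVM_OPTS:

    JVM_OPTS="$JVM_OPTS -XX:-ReduceInitialCardMarks"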


On Tue, May 4, 2010 at 11:24 PM, Ran Tavory  wrote:
> Running a cluster of 0.6.1, one of the hosts crashed during GC.
> Can anyone suggest what needs tweaking to prevent that?
> The last configuration change I made was to
> change DiskAccessMode to standard (it was auto); but I can't
> determine if it's related.
>
> System is 64 bit centos, the rest of the system info is at the bottom of the
> email.
> The last lines before the crash from system.log are:
>
>  INFO [COMPACTION-POOL:1] 2010-05-04 17:01:49,913 CompactionManager.java
> (line 246) Compacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-607-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-608-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-609-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-610-Data.db')]
>  INFO [COMMIT-LOG-WRITER] 2010-05-04 17:01:49,920 CommitLog.java (line 407)
> Discarding obsolete commit
> log:CommitLogSegment(/outbrain/cassandra/commitlog/CommitLog-1273006181463.log)
>  INFO [COMPACTION-POOL:1] 2010-05-04 17:02:09,820 CompactionManager.java
> (line 326) Compacted to
> /outbrain/cassandra/data/outbrain_kvdb/KvImpressions-611-Data.db.
>  291072443/291071793 bytes for 123292 keys.  Time: 19906ms.
> And a summary of hs_err is here. The rest is attached. Let me know if you've
> seen this before and can advise, thanks!
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x2add529adb76, pid=943, tid=1112779072
> #
> # JRE version: 6.0_17-b04
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.3-b01 mixed mode
> linux-amd64 )
> # Problematic frame:
> # V  [libjvm.so+0x1ffb76]
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> ---  T H R E A D  ---
> Current thread (0x47276800):  GCTaskThread [stack:
> 0x,0x] [id=974]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (),
> si_addr=0x
>
> ---  P R O C E S S  ---
> Java Threads: ( => current thread )
>   0x47a08800 JavaThread "FILEUTILS-DELETE-POOL:1" [_thread_blocked,
> id=1297, stack(0x43e54000,0x43f55000)]
>   0x47ab9000 JavaThread "CACHETABLE-TIMER-3" daemon
> [_thread_blocked, id=1270, stack(0x49dbf000,0x49ec)]
>   0x47691000 JavaThread "Thread-26" [_thread_in_native, id=1191,
> stack(0x43d53000,0x43e54000)]
>   0x476ac000 JavaThread "Thread-19" [_thread_in_native, id=1117,
> stack(0x48daf000,0x48eb)]
>   0x47a42800 JavaThread "WRITE-/192.168.252.61" [_thread_blocked,
> id=1110, stack(0x49cbe000,0x49dbf000)]
>   0x47abc800 JavaThread "WRITE-/192.168.252.62" [_thread_blocked,
> id=1109, stack(0x49bbd000,0x49cbe000)]
>   0x2aabc841b800 JavaThread "Hint delivery" [_thread_blocked, id=1108,
> stack(0x49abc000,0x49bbd000)]
>   0x2aabc841a800 JavaThread "HINTED-HANDOFF-POOL:1" [_thread_blocked,
> id=1107, stack(0x499bb000,0x49abc000)]
>   0x47693800 JavaThread "Thread-16" [_thread_in_native, id=1106,
> stack(0x498ba000,0x499bb000)]
>   0x47693000 JavaThread "WRITE-/192.168.252.99" [_thread_blocked,
> id=1105, stack(0x497b9000,0x498ba000)]
>   0x47b05000 JavaThread "Thread-15" [_thread_blocked, id=1104,
> stack(0x496b8000,0x497b9000)]
>   0x47a7b000 JavaThread "Thread-14" [_thread_in_native, id=1103,
> stack(0x495b7000,0x496b8000)]
>   0x2aabc8056000 JavaThread "WRITE-/192.168.254.59" [_thread_blocked,
> id=1102, stack(0x494b6000,0x495b7000)]
>   0x2aabccf61800 JavaThread "Thread-13" [_thread_in_native, id=1101,
> stack(0x493b5000,0x494b6000)]
>   0x2aabccf5f000 JavaThread "W

Re: performance tuning - where does the slowness come from?

2010-05-04 Thread Nathan McCall
You could try mmap_index_only - this would restrict mmap usage to the
index files.

-Nate
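
For reference, this is the DiskAccessMode element in storage-conf.xml; in
the 0.6 line the recognized values were auto, mmap, mmap_index_only and
standard:

    <DiskAccessMode>mmap_index_only</DiskAccessMode>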

On Tue, May 4, 2010 at 11:57 AM, Ran Tavory  wrote:
> I canceled mmap and indeed memory usage is sane again. So far performance
> hasn't been great, but I'll wait and see.
> I'm also interested in a way to cap mmap so I can take advantage of it but
> not swap the host to death...
>
> On Tue, May 4, 2010 at 9:38 PM, Kyusik Chung 
> wrote:
>>
>> This sounds just like the slowness I was asking about in another thread -
>> after a lot of reads, the machine uses up all available memory on the box
>> and then starts swapping.
>> My understanding was that mmap helps greatly with read and write perf
>> (until the box starts swapping I guess)...is there any way to use mmap and
>> cap how much memory it takes up?
>> What do people use in production?  mmap or no mmap?
>> Thanks!
>> Kyusik Chung
>> On May 4, 2010, at 10:11 AM, Schubert Zhang wrote:
>>
>> 1. When you initially start up your nodes, please plan the InitialToken of
>> each node evenly.
>> 2. Set DiskAccessMode to standard.
>>
>> On Tue, May 4, 2010 at 9:09 PM, Boris Shulman  wrote:
>>>
>>> I think that the extra (more than 4GB) memory usage comes from the
>>> mmaped io, that is why it happens only for reads.
>>>
>>> On Tue, May 4, 2010 at 2:02 PM, Jordan Pittier 
>>> wrote:
>>> > I'm facing the same issue with swap. It only occurs when I perform read
>>> > operations (writes are very fast :)). So I can't help you with the
>>> > memory problem.
>>> >
>>> > But to balance the load evenly between nodes in the cluster, just
>>> > manually fix their tokens (the "formula" is i * 2^127 / nb_nodes).
>>> >
>>> > Jordzn
>>> >
>>> > On Tue, May 4, 2010 at 8:20 AM, Ran Tavory  wrote:
>>> >>
>>> >> I'm looking into performance issues on a 0.6.1 cluster. I see two
>>> >> symptoms:
>>> >> 1. Reads and writes are slow
>>> >> 2. One of the hosts is doing a lot of GC.
>>> >> 1 is slow in the sense that in its normal state the cluster used to do
>>> >> around 3-5k reads and writes per second (6-10k operations per second),
>>> >> but now it's in the order of 200-400 ops per second, sometimes even less.
>>> >> 2 looks like this:
>>> >> $ tail -f /outbrain/cassandra/log/system.log
>>> >>  INFO [GC inspection] 2010-05-04 00:42:18,636 GCInspector.java (line 110) GC for ParNew: 672 ms, 166482384 reclaimed leaving 2872087208 used; max is 4432068608
>>> >>  INFO [GC inspection] 2010-05-04 00:42:28,638 GCInspector.java (line 110) GC for ParNew: 498 ms, 166493352 reclaimed leaving 2836049448 used; max is 4432068608
>>> >>  INFO [GC inspection] 2010-05-04 00:42:38,640 GCInspector.java (line 110) GC for ParNew: 327 ms, 166091528 reclaimed leaving 2796888424 used; max is 4432068608
>>> >> ... and it goes on and on for hours, no stopping...
>>> >> The cluster is made of 6 hosts, 3 in one DC and 3 in another.
>>> >> Each host has 8G RAM.
>>> >> -Xmx=4G
>>> >> For some reason, the load isn't distributed evenly b/w the hosts,
>>> >> although
>>> >> I'm not sure this is the cause for slowness
>>> >> $ nodetool -h localhost -p 9004 ring
>>> >> Address         Status  Load       Range                                      Ring
>>> >>                                     144413773383729447702215082383444206680
>>> >> 192.168.252.99  Up      15.94 GB    66002764663998929243644931915471302076    |<--|
>>> >> 192.168.254.57  Up      19.84 GB    81288739225600737067856268063987022738    |   ^
>>> >> 192.168.254.58  Up      973.78 MB   86999744104066390588161689990810839743    v   |
>>> >> 192.168.252.62  Up      5.18 GB     88308919879653155454332084719458267849    |   ^
>>> >> 192.168.254.59  Up      10.57 GB    142482163220375328195837946953175033937   v   |
>>> >> 192.168.252.61  Up      11.36 GB    144413773383729447702215082383444206680   |-->|
>>> >> The slow host is 192.168.252.61 and it isn't the most loaded one.
>>> >> The host is waiting a lot on IO and the load average is usually 6-7
>>> >> $ w
>>> >>  00:42:56 up 11 days, 13:22,  1 user,  load average: 6.21, 5.52, 3.93
>>> >> $ vmstat 5
>>> >> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>>> >>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>> >>  0  8 2147844  45744   1816 4457384    6    5    66    32    5    2  1  1 96  2  0
>>> >>  0  8 2147164  49020   1808 4451596  385    0  2345    58 3372 9957  2  2 78 18  0
>>> >>  0  3 2146432  45704   1812 4453956  342    0  2274   108 3937 10732  2  2 78 19  0
>>> >>  0  1 2146252  44696   1804 4453436  345  164  1939   294 3647 7833  2  2 78 18  0
>>> >>  0  1 2145960  46924   1744 4451260  158    0  2423   122 4354 14597  2  2 77 18  0
>>> >>  7  1 2138344  44676    952 4504148 1722  403  1722   406 1388  439 87  0 10  2  0
>>> >>  7  2 2137248  45652    956 4499436 1384

Re: Problem with JVM? concurrent mode failure

2010-04-28 Thread Nathan McCall
 -XX:CMSInitiatingOccupancyFraction=75

It's at 90 by default on most systems. Turning this down to the above
would trigger the concurrent collection when the heap is 75% full, as
opposed to 90%.

-Nate
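
Put together, a non-incremental CMS configuration along the lines discussed
in this thread might look like the following (all standard HotSpot flags;
the occupancy value is workload-dependent, and UseCMSInitiatingOccupancyOnly
makes the fraction binding rather than a hint):

    JVM_OPTS="$JVM_OPTS \
      -XX:+UseConcMarkSweepGC \
      -XX:+CMSParallelRemarkEnabled \
      -XX:CMSInitiatingOccupancyFraction=75 \
      -XX:+UseCMSInitiatingOccupancyOnly"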

On Wed, Apr 28, 2010 at 7:44 AM, Jonathan Ellis  wrote:
> Yes, incremental mode is definitely contraindicated for Cassandra.
>
> On Wed, Apr 28, 2010 at 1:07 AM, Peter Schuller
>  wrote:
>>>     -XX:+CMSIncrementalMode \
>>>     -XX:+CMSIncrementalPacing \
>>
>> This may  not be an issue given your other VM opts, but just FYI I
>> have had some difficulty making the incremental CMS mode perform GC
>> work sufficiently aggressively to avoid concurrent mode failures
>> during significant garbage generation.
>>
>> You may want to give CMS a try in non-incremental mode (i.e., the
>> concurrent mark phase will be done in one burst, rather than spread
>> out over time, but still concurrently).
>>
>> --
>> / Peter Schuller
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: Cassandra reverting deletes?

2010-04-27 Thread Nathan McCall
Have you confirmed that your clocks are all synced across the cluster?
If they are not, this may be the result of an unintentional read-repair.

-Nate

On Tue, Apr 27, 2010 at 2:20 PM, Joost Ouwerkerk  wrote:
> Hmm... Even after deleting with cl.ALL, I'm getting data back for some
> rows after having deleted them.  Which rows return data is
> inconsistent from one run of the job to the next.
>
> On Tue, Apr 27, 2010 at 1:44 PM, Joost Ouwerkerk  wrote:
>> To check that rows are gone, I check that KeySlice.columns is empty.  And as
>> I mentioned, immediately after the delete job, this returns the expected
>> number.
>> Unfortunately I reproduced with QUORUM this morning.  No node outages.  I am
>> going to try ALL to see if that changes anything, but I am starting to
>> wonder if I'm doing something else wrong.
>> On Mon, Apr 26, 2010 at 9:45 PM, Jonathan Ellis  wrote:
>>>
>>> How are you checking that the rows are gone?
>>>
>>> Are you experiencing node outages during this?
>>>
>>> DC_QUORUM is unfinished code right now, you should avoid using it.
>>> Can you reproduce with normal QUORUM?
>>>
>>> On Sat, Apr 24, 2010 at 12:23 PM, Joost Ouwerkerk 
>>> wrote:
>>> > I'm having trouble deleting rows in Cassandra.  After running a job that
>>> > deletes hundreds of rows, I run another job that verifies that the rows
>>> > are
>>> > gone.  Both jobs run correctly.  However, when I run the verification
>>> > job an
>>> > hour later, the rows have re-appeared.  This is not a case of "ghosting"
>>> > because the verification job actually checks that there is data in the
>>> > columns.
>>> >
>>> > I am running a cluster with 12 nodes and a replication factor of 3.  I
>>> > am
>>> > using DC_QUORUM consistency when deleting.
>>> >
>>> > Any ideas?
>>> > Joost.
>>> >
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>
>>
>
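
On the verification point raised above, here is a sketch against the 0.6
Thrift API (placeholder names) of deleting a row at QUORUM with an explicit
timestamp and then checking that it has no live columns. Note that a freshly
deleted row can still show up as a key in range scans, just with an empty
column list:

    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ColumnPath;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class VerifyDelete {
        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();

            // delete the whole row: a ColumnPath naming only the CF
            long ts = System.currentTimeMillis();
            client.remove("Keyspace1", "row1", new ColumnPath("Standard1"),
                          ts, ConsistencyLevel.QUORUM);

            // verify: "gone" means no live columns, not an absent key
            SlicePredicate all = new SlicePredicate();
            all.setSlice_range(
                new SliceRange(new byte[0], new byte[0], false, 10));
            List<ColumnOrSuperColumn> cols = client.get_slice(
                "Keyspace1", "row1", new ColumnParent("Standard1"),
                all, ConsistencyLevel.QUORUM);
            System.out.println(cols.isEmpty() ? "row deleted"
                                              : "row still live");
            socket.close();
        }
    }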


Re: Cassandra Java Client

2010-04-20 Thread Nathan McCall
Dop,
Thank you for trying out hector. I think you have the right approach
for using it with your project. Feel free to ping us directly
regarding Hector on either of these mailings lists as appropriate:
http://wiki.github.com/rantav/hector/mailing-lists

Cheers,
-Nate

On Tue, Apr 20, 2010 at 7:11 AM, Dop Sun  wrote:
> Hi,
>
>
>
> I have downloaded hector-0.6.0-10.jar. As you mentioned, it has a good
> implementation of connection pooling and JMX counters.
>
>
>
> What I'm doing is using Hector to create the Cassandra client (specifically:
> borrow_client(url, port)). My understanding is that, this way, Jassandra
> will benefit from the connection pool and the JMX counters.
>
>
>
> http://code.google.com/p/jassandra/issues/detail?id=17
>
>
>
> Please feel free to let me know if you have any suggestions.
>
>
>
> A new build, 1.0.0 build 3 (http://code.google.com/p/jassandra/), has been
> created. On the Jassandra client side there are no API changes.
>
>
>
> Cheers~~~
>
> Dop
>
>
>
> From: Ran Tavory [mailto:ran...@gmail.com]
> Sent: Tuesday, April 20, 2010 1:36 AM
> To: user@cassandra.apache.org
> Subject: Re: Cassandra Java Client
>
>
>
> Hi Dop, you may want to look at hector as a low level cassandra client on
> which you build jassandra, adding hibernate style magic etc like other ppl
> have done with ORM layers on top of it.
>
> Hector's main features include extensive jmx counters, failover and
> connection pooling.
>
> It's available for all recent versions, including 0.5.0, 0.5.1, 0.6.0 and
> 0.6.1
>
> On Mon, Apr 19, 2010 at 5:58 PM, Dop Sun  wrote:
>
> Well, there are a couple of points behind why Jassandra was created:
>
> 1. First of all, I wanted to create something like this because I come from
> a JDBC background and am familiar with the Hibernate API. The ICriteria
> interface (which is used for querying) is inspired by Hibernate's Criteria
> API.
>
> Actually, maybe because of this background, it cost me a lot of effort to
> understand Cassandra in the beginning, and the Thrift API also takes time
> to learn.
>
> 2. Jassandra creates a layer which removes the direct link to the
> underlying Thrift API (including the exceptions, the ConsistencyLevel
> enumeration, etc.)
>
> I highlight this point because I believe Jassandra clients will benefit
> when the implementation changes in the future: for example, if Cassandra
> provides a better Thrift API for selecting the columns for a list of keys
> or SCFs, or deprecates some structures or exceptions, the client code may
> not need to change. Of course, if Jassandra fails to prove itself, this is
> actually not an advantage. :)
>
> 3. Jassandra is designed to be a JDBC-like API, no less, no more. It
> strives to offer the best API for querying (by token, key, SCF/CF) and
> doing CRUD, but no more than that. For example, it does not cover anything
> like object mapping, but it should cover all the functionality the Thrift
> API provides.
>
> These 3 points differ from Hector (I should be honest that I have not
> tried it before; my sense of the differences comes from the sample code
> Hector provides).
>
> So, the API Jassandra abstracted was something like this:
>
>    IConnection connection = DriverManager.getConnection(
>        "thrift://localhost:9160", info);
>    try {
>      // 2. Get a KeySpace by name
>      IKeySpace keySpace = connection.getKeySpace("Keyspace1");
>
>      // 3. Get a ColumnFamily by name
>      IColumnFamily cf = keySpace.getColumnFamily("Standard2");
>
>      // 4. Insert like this
>      long now = System.currentTimeMillis();
>      ByteArray nameFirst = ByteArray.ofASCII("first");
>      ByteArray nameLast = ByteArray.ofASCII("last");
>      ByteArray nameAge = ByteArray.ofASCII("age");
>      ByteArray valueLast = ByteArray.ofUTF8("Smith");
>      IColumn colFirst = new Column(nameFirst, ByteArray.ofUTF8("John"),
> now);
>      cf.insert(userName, colFirst);
>
>      IColumn colLast = new Column(nameLast, valueLast, now);
>      cf.insert(userName, colLast);
>
>      IColumn colAge = new Column(nameAge, ByteArray.ofLong(42), now);
>      cf.insert(userName, colAge);
>
>      // 5. Select like this
>      ICriteria criteria = cf.createCriteria();
>      criteria.keyList(Lists.newArrayList(userName))
>          .columnRange(nameAge, nameLast, 10);
>      Map<String, List<IColumn>> map = criteria.select();
>      List<IColumn> list = map.get(userName);
>      Assert.assertEquals(3, list.size());
>      Assert.assertEquals(valueLast, list.get(2).getValue());
>
>      // 6. Delete like this
>      cf.delete(userName, colFirst);
>      map = criteria.select();
>      Assert.assertEquals(2, map.get(userName).size());
>
>      // 7. Get count like this
>      criteria = cf.createCriteria();
>      criteria.keyList(Lists.newArrayList(userName));
>      int count = criteria.count();
>      Assert.assertEquals(2, count);
>    } finally {
>      // 8. Don't forget to close the connection.
>      connection.close();
>
>    }
>  }
>
> -Original Message-
> From: Jonathan Ellis [mailto:jb

Re: get_range_slices in hector

2010-04-19 Thread Nathan McCall
Not yet. If you wanted to provide a patch that would be much
appreciated. A fork and pull request would be best logistically, but
whatever works.

-Nate

On Mon, Apr 19, 2010 at 5:10 PM, Chris Dean  wrote:
> Is there a version of hector that has an interface to get_range_slices,
> or should I provide a patch?
>
> Cheers,
> Chris Dean
>


Re: Is that possible to write a file system over Cassandra?

2010-04-15 Thread Nathan McCall
In regards to hector, please check all the available branches on
github. We have supported 0.6 for a little while now.

http://github.com/rantav/hector/tree/0.6.0

The master is still based on 0.5, but that is changing in the next
couple of days to match the 0.6 release.

-Nate




On Thu, Apr 15, 2010 at 6:35 PM, Jeff Zhang  wrote:
> Jonathan,
>
> Previously we used cassandra-0.6, but we'd like to leverage the hector
> java client since it has more advanced features, and hector currently only
> supports cassandra-0.5.
> Why do you think using cassandra-0.5 is a strange way to do it? Is
> cassandra-0.6 incompatible with cassandra-0.5? Will the migration to
> cassandra-0.6 cost much?
>
>
> On Thu, Apr 15, 2010 at 11:50 AM, Jonathan Ellis  wrote:
>>
>> You forked Cassandra 0.5 for that?
>>
>> That's... a strange way to do it.
>>
>> On Wed, Apr 14, 2010 at 9:36 PM, Jeff Zhang  wrote:
>> > We are currently doing exactly that, and we are still at the starting
>> > stage. Currently we only plan to store small files; for large files,
>> > splitting into small blocks is one of our options.
>> > You can check out from here http://code.google.com/p/cassandra-fs/
>> >
>> > Documentation for this project is lacking right now, but any feedback
>> > and contributions are welcome.
>> >
>> >
>> >
>> > On Wed, Apr 14, 2010 at 7:32 PM, Miguel Verde 
>> > wrote:
>> >>
>> >> On Wed, Apr 14, 2010 at 9:26 PM, Avinash Lakshman
>> >>  wrote:
>> >>>
>> >>> OPP is not required here. You would be better off using a Random
>> >>> partitioner because you want to get a random distribution of the
>> >>> metadata.
>> >>
>> >>
>> >> Not required, certainly.  However, it strikes me that 1 cluster is
>> >> better
>> >> than 2, and most consumers of a filesystem would expect to be able to
>> >> get an
>> >> ordered listing or tree of the metadata which is easy using the OPP row
>> >> key
>> >> pattern listed previously.  You could still do this with the Random
>> >> partitioner using column names in rows to describe the structure but
>> >> the
>> >> current compaction limitations could be an issue if a branch becomes
>> >> too
>> >> large, and you'd still have a root row hotspot (at least in the schema
>> >> which
>> >> comes to mind).
>> >
>> >
>> > --
>> > Best Regards
>> >
>> > Jeff Zhang
>> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: framed transport

2010-04-15 Thread Nathan McCall
FWIW, we just exposed this as an option in hector.

-Nate

On Thu, Apr 15, 2010 at 8:38 AM, Miguel Verde  wrote:
> On Thu, Apr 15, 2010 at 10:22 AM, Eric Evans  wrote:
>>
>> But, if you've enabled framing on the server, you will not
>> be able to use C# clients (last I checked, there was no framed transport
>> for C#).
>
>
> There *are* many clients that don't have framed transports, but the C#
> client had it added in November:
> https://issues.apache.org/jira/browse/THRIFT-210


Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Nathan McCall
I noticed you turned key caching off in your ColumnFamily declaration;
have you tried experimenting with it turned on and playing with the key
cache configuration? Also, have you looked at the JMX output to see which
commands are pending execution? That is always helpful to me in hunting
down bottlenecks.

-Nate
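
For reference, key caching in 0.6 is declared per ColumnFamily in
storage-conf.xml; the KeysCached attribute takes an absolute count or a
percentage, e.g.:

    <ColumnFamily Name="Standard1" CompareWith="BytesType" KeysCached="10%"/>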

On Thu, Mar 25, 2010 at 9:31 AM, Henrik Schröder  wrote:
> On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne  wrote:
>>
>> I don't know if that could play any role, but if you have disabled
>> assertions when running cassandra (that is, you removed the -ea line in
>> cassandra.in.sh): there was a bug in 0.6beta2 that made reads in rows
>> with lots of columns quite slow.
>
> We tried it with beta3 and got the same results, so that didn't do anything.
>
>>
>> Another problem you may have is if the commitLog directory is on the same
>> hard drive as the data directory. If that's the case and you read and
>> write at the same time, that may be a reason for poor read performance
>> (and write performance too).
>
> We also tested doing only reads, and got about the same read speeds.
>
>>
>> As for the row with 30 million columns, you have to be aware that right
>> now cassandra will deserialize whole rows during compaction
>> (http://wiki.apache.org/cassandra/CassandraLimitations).
>> So depending on the size of what you store in your columns, you could very
>> well hit that limitation (which could be why you OOM). In that case, I see
>> two choices: 1) add more RAM to the machine, or 2) change your data
>> structure to avoid it (maybe you can split rows with too many columns
>> somehow?).
>
> Splitting the rows would be an option if we got anything near decent speed
> for small rows, but even if we only have a few hundred thousand columns in
> one row, the read speed is still slow.
>
> What kind of numbers are common for this type of operation? Say that you
> have a row with 500,000 columns whose names range from 0x0 to 0x7A120, and
> you do get_slice operations on that with ranges of random numbers in the
> interval but with a fixed count of 1000, and that you multithread it with
> ~10 threads; can't you get more than 50 reads/s?
>
> When we've been reading up on Cassandra we've seen posts that billions of
> columns in a row shouldn't be a problem, and sure enough, writing all that
> data goes pretty fast, but as soon as you want to retrieve it, it is really
> slow. We also tried doing counts on the number of columns in a row, and that
> was really, really slow: it took half a minute to count the columns in a row
> with 500,000 columns, and when doing the same on a row with millions, it just
> crashed with an OOM exception after a few minutes.
>
>
> /Henrik
>


Re: 'Tearing down' a test database

2010-03-24 Thread Nathan McCall
Take a look at EmbeddedServerHelper in the hector client for an
example of how this is managed through a test case:
http://github.com/rantav/hector/blob/master/src/test/java/me/prettyprint/cassandra/testutils/EmbeddedServerHelper.java

-Nate
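
The shape of a test harness built on that class, as a sketch (the
setup/teardown method names are assumptions; check the linked source for
the exact signatures):

    import me.prettyprint.cassandra.testutils.EmbeddedServerHelper;
    import org.junit.AfterClass;
    import org.junit.BeforeClass;

    public abstract class CassandraTestBase {
        private static EmbeddedServerHelper embedded;

        @BeforeClass
        public static void startEmbeddedCassandra() throws Exception {
            embedded = new EmbeddedServerHelper();
            embedded.setup();  // boots an in-process Cassandra on fresh dirs
        }

        @AfterClass
        public static void stopEmbeddedCassandra() throws Exception {
            EmbeddedServerHelper.teardown();  // wipes commitlog and data dirs
        }
    }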

On Wed, Mar 24, 2010 at 5:10 AM, Philip Jackson  wrote:
> Hi,
>
> Just trying out Cassandra (0.5), looks great so far but I've got a
> question about removing data:
>
> For my test suite I would like to be able to build-up data in the
> database and then have the test framework tear it all back down
> again. Trouble is, if I do a batch_insert, remove, batch_insert (of
> the same data) the second insert doesn't seem to work (though there's
> no exception). It's very possibly me doing something foolish, but does
> anyone else work with this sort of setup? Perhaps my problem sounds
> familiar?
>
> Cheers,
> Phil
>
> P.S The test instances will only ever be on one node and the data in
> them unimportant.
>


Re: Digg's data model

2010-03-19 Thread Nathan McCall
Gary,
Did you see this article linked from the Cassandra wiki?
http://about.digg.com/node/564

See http://wiki.apache.org/cassandra/ArticlesAndPresentations for more
examples like the above. In general, you structure your data according
to how it will be queried. This can lead to duplication, but that is
one of the trade-offs for performance and scale.

Digg folks - the "Looking to the Future with Cassandra" article linked on
the wiki is no longer available. I found it quite helpful originally. Is
there a chance it could be re-posted?

Cheers,
-Nate

On Fri, Mar 19, 2010 at 12:16 PM, Gary  wrote:
> I am a newbie to the bigtable-like model and have a question, as follows.
> Take Digg as an example: I want to find the list of users who dug a URL, and
> also the list of URLs a user dug. What should the data model look like for
> these queries to be efficient? If I use the username and the URL as two row
> keys, then when a user digs a URL I will have to update two rows, so I need
> a transaction to keep the data consistent.
> Any thoughts?
> Thanks,
> Gary
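
For what it's worth, the usual shape of the answer to the question above is
two mirrored ColumnFamilies, sketched here in 0.6-style storage-conf.xml
(the names are illustrative):

    <!-- row key: URL;      one column per user who dug it -->
    <ColumnFamily Name="UrlDiggers" CompareWith="UTF8Type"/>
    <!-- row key: username; one column per URL the user dug -->
    <ColumnFamily Name="UserDiggs" CompareWith="UTF8Type"/>

A digg is then recorded by writing one column into each ColumnFamily,
ideally in a single batch_mutate call. Cassandra has no transactions, but
both writes are idempotent, so a client can simply retry the pair on failure
until it succeeds.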


Re: cassandra not responding

2010-03-16 Thread Nathan McCall
The cache is a "second-chance FIFO" from this library:
http://code.google.com/p/concurrentlinkedhashmap/source/browse/trunk/src/java/com/reardencommerce/kernel/collections/shared/evictable/ConcurrentLinkedHashMap.java

That sounds like an awful lot of churn given the size of the queue and
the number of references it might keep for the second-chance stuff.
How big of a hot data set do you need to maintain? The amount of
overhead for such a large record set may not buy you anything over
just relying on the file system cache and turning down the heap size.

-Nate

On Tue, Mar 16, 2010 at 1:17 PM, B. Todd Burruss  wrote:
> I think I'd better make sure I understand how the row/key cache works.  I
> currently have both set to 10%.  So if cassandra needs to read data from an
> sstable that has 100 million rows, it will cache 10,000,000 rows of data
> from that sstable?  So if my row is ~4k, then we're looking at ~40gb used by
> cache?
>