Re: Usage Pattern: "unique" value of a key

2011-01-12 Thread Oleg Anastasyev
Benoit Perroud  noisette.ch> writes:

> 
> My idea to solve such use case is to have both thread writing the
> username, but with a colum like "lock-", and then read
> the row, and find out if the first lock column appearing belong to the
> thread. If this is the case, it can continue the process, otherwise it
> has been preempted by another thread.

This looks OK for this task. As an alternative, you can avoid creating the
extra 'lock-' column and instead compare the timestamps of the new user data
you just wrote. It is unlikely that two racing threads will generate exactly
the same microsecond timestamp at the moment of creating a new user - so if
the data you read back has exactly the timestamp you used to write it, it is
your data.

Another possible way is to use an external lock coordinator, e.g. ZooKeeper.
For this task it looks like a bit of overkill, but it can become more
valuable if you have more data-concurrency issues to solve and can bear an
extra 5-10 ms of latency on update operations.
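Sketching the timestamp idea in isolation (a toy, single-process model: a plain dict stands in for a column family with last-write-wins reconciliation, and `write`/`read`/`claim_username` are made-up helpers, not part of any Cassandra client API):

```python
import time

# Toy column store: each cell holds (value, timestamp) and the highest
# timestamp wins, mimicking Cassandra's per-column reconciliation.
store = {}

def write(row, column, value):
    ts = int(time.time() * 1_000_000)  # microsecond timestamp chosen by the client
    existing = store.get((row, column))
    if existing is None or ts >= existing[1]:
        store[(row, column)] = (value, ts)
    return ts

def read(row, column):
    return store.get((row, column))  # (value, timestamp) or None

def claim_username(username, owner):
    """Write, read back, and compare timestamps: if the surviving cell
    carries exactly the timestamp we wrote, our write won any race."""
    my_ts = write(username, "owner", owner)
    value, ts = read(username, "owner")
    return ts == my_ts and value == owner

print(claim_username("alice", "thread-1"))  # True: no competing writer
```

If two writers did land on the same microsecond, the value check in the read-back would still expose the loser - which is why a tie is unlikely rather than catastrophic here.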



Old data not indexed

2011-01-12 Thread Tan Yeh Zheng
I tried to run the example on
http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes
programmatically.

After inserting data, I created an index on the column "state" and tried
get_indexed_slices (where state = 'UT'), but it returned an empty list. If I
create the index first and then insert and query, it returns the correct
result. Any advice is appreciated.

Thanks.
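If the cause is that the index over the pre-existing rows simply hasn't been built (backfilled) yet at query time, the behaviour can be mimicked with a toy model - `create_index` and `get_indexed_slices` below are made-up stand-ins, not the Thrift API:

```python
# Toy model: an index only answers for rows inserted after (or backfilled
# into) it. This illustrates why querying right after creating an index
# over pre-existing data can return nothing until the build completes.
rows = {}             # row_key -> {column: value}
index = {}            # (column, value) -> set of row_keys
indexed_columns = set()

def insert(key, columns):
    rows[key] = columns
    for col in indexed_columns & columns.keys():
        index.setdefault((col, columns[col]), set()).add(key)

def create_index(column, backfill=False):
    indexed_columns.add(column)
    if backfill:  # the rebuild step over old data
        for key, columns in rows.items():
            if column in columns:
                index.setdefault((column, columns[column]), set()).add(key)

def get_indexed_slices(column, value):
    return sorted(index.get((column, value), set()))

insert("bsanderson", {"state": "UT"})
create_index("state")                      # no backfill: old row is invisible
print(get_indexed_slices("state", "UT"))   # []
create_index("state", backfill=True)       # after the index build completes
print(get_indexed_slices("state", "UT"))   # ['bsanderson']
```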



RE: Advice wanted on modeling

2011-01-12 Thread Steven Mac

> Date: Thu, 13 Jan 2011 01:29:33 +0100
> Subject: Re: Advice wanted on modeling
> From: peter.schul...@infidyne.com
> To: user@cassandra.apache.org
> 
> > The application will have a large number of records, with the records
> > consisting of a fixed part and a number (n) of periodic parts.
> > * The fixed part is updated occasionally.
> > * The periodic parts are never updated, but a new one is added every 5 to 10
> > minutes. Only the last n periodic parts need to be kept, so that the oldest
> > one can be deleted after adding a new part.
> > * The records will always be read completely (meaning fixed part and all
> > periodic parts). Reads are less frequent than writes.
> > The application will be running continuously, at least for a few weeks, so
> > there will be many, many stale periodic parts, so I'm a bit worried about
> > data consumption and compactions.
> 
> I was going to hit send on a partial recommendation but realized I
> don't really have enough information given that you seem to be making
> pretty specific optimizations.
> 
> You say writes are more frequent than reads. To what extent - are
> reads *very* infrequent to the point that the performance of the reads
> are almost completely irrelevant?

What exactly is a write? Is it a single record update, or a batch of record
updates executed in one operation? In my case I'm batching about a thousand
record updates (new periodic parts) into a single batch_mutate. A read would
constitute fetching all parts of a single record. In the text below I'm using
the term update to mean a record update.

I typically expect a few reads for every thousand updates (<1%), although
read pressure will vary considerably over time. I don't expect more than a
hundred reads per thousand updates (about 10%). Read performance is not
irrelevant, but definitely subordinate to write performance, which is
crucial (and one of the reasons I selected Cassandra).

> You seem worried about tombstones and data size. Is the issue that
> you're expecting huge amounts of data and disk space/compaction
> frequency is an issue?

Yes, I am expecting huge amounts of data and without compaction I would
soon (few days to a week) run out of disk space.

> Are you expecting write load to be high such that performance of
> writes (and compaction) is a concern, or is it mostly about slowly
> building up huge amounts of data that you want to be compact on disk?

I'm not sure here. My write load is high, estimated at a thousand records
per second (batched, of course).
  

RE: about the data directory

2011-01-12 Thread Viktor Jevdokimov
>I have 4 nodes, then I create one keyspace (such as FOO) with replication
>factor = 1 and insert some data. Why can I see the directory of
>/var/lib/cassandra/data/FOO on every node? As I know, I have just one replica.

So why have you installed 4 nodes instead of 1?

They're there so your data can be distributed across the 4 nodes, with 1 copy
kept on one of them. Think of it as having 100% of the data with each node
holding about 25% of it (random partitioning). The directory appears on every
node because every node stores its own share of the keyspace.


Viktor.


about the data directory

2011-01-12 Thread raoyixuan (Shandy)
I have 4 nodes, then I create one keyspace (such as FOO) with replication
factor = 1 and insert some data. Why can I see the directory of
/var/lib/cassandra/data/FOO on every node? As I know, I have just one replica.

华为技术有限公司 Huawei Technologies Co., Ltd.

Phone: 28358610
Mobile: 13425182943
Email: raoyix...@huawei.com
Huawei Technologies Co., Ltd.
Bantian, Longgang District, Shenzhen 518129, P.R. China
http://www.huawei.com

This e-mail and its attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed above.
Any use of the information contained herein in any way (including, but not
limited to, total or partial disclosure, reproduction, or dissemination) by
persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately
and delete it!

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
Thanks, I totally get it.

From: Tyler Hobbs [mailto:ty...@riptano.com]
Sent: Thursday, January 13, 2011 2:19 PM
To: user@cassandra.apache.org
Subject: Re: about the insert data

The coordinator node routes the request in parallel to all of the replicas and 
waits for responses.  One of those replicas might happen to be the coordinator 
itself.

Only replicas read/write data they are responsible for, not the coordinator 
(unless the coordinator is also a replica for that data).

- Tyler
On Thu, Jan 13, 2011 at 12:07 AM, raoyixuan (Shandy) <raoyix...@huawei.com> wrote:
I mean whether both the coordinate node and the replica node keep the insert 
data. Or just the replica node keep the insert data. And the coordinate node 
just route the insert data to the replica. Can you get me?

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Thursday, January 13, 2011 1:56 PM
To: user
Subject: Re: about the insert data
On Wed, Jan 12, 2011 at 5:46 PM, raoyixuan (Shandy) <raoyix...@huawei.com> wrote:
> So you mean the coordinator node is just responsible for routing the request.

Right.  Of course, if the coordinator node happens to also be a
replica, it can be a little more efficient by performing that
operation directly rather than going over a socket.

> where the request will be Routed? whether the coordinator node route the 
> request to the first replica to insert the data?

I don't understand the question.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com



Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
On Wed, 2011-01-12 at 14:21 -0800, Ryan King wrote:
> What consistency level did you use to write the
> data? 

R=1,W=1 (reads happen a long time afterwards).

~mck

-- 
"It is now quite lawful for a Catholic woman to avoid pregnancy by a
resort to mathematics, though she is still forbidden to resort to
physics and chemistry." H.L. Mencken | www.semb.wever.org | www.sesat.no
| www.finn.no | http://xss-http-filter.sf.net




Re: about the insert data

2011-01-12 Thread Tyler Hobbs
The coordinator node routes the request in parallel to all of the replicas
and waits for responses.  One of those replicas might happen to be the
coordinator itself.

Only replicas read/write data they are responsible for, not the coordinator
(unless the coordinator is also a replica for that data).
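A rough sketch of that flow as a toy model (assumptions: `replicas_for` stands in for the partitioner plus a SimpleStrategy-like replication strategy, and the sequential loop stands in for parallel RPCs - none of these names are real Cassandra code):

```python
import hashlib

NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
RF = 3  # replication factor

def replicas_for(key):
    # Hypothetical SimpleStrategy-like placement: hash the key onto the
    # ring and take the next RF distinct nodes.
    start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]

def coordinate_write(coordinator, key, value, storage):
    # The coordinator forwards the write to every replica (in parallel in
    # reality, answering the client once the consistency level's worth of
    # acks have arrived). It stores the row itself only if it happens to
    # be one of the replicas.
    targets = replicas_for(key)
    for node in targets:
        storage.setdefault(node, {})[key] = value  # replica applies the write
    return targets, coordinator in targets

storage = {}
targets, coordinator_is_replica = coordinate_write("10.0.0.4", "user:42", "profile-data", storage)
print(len(targets))                        # 3: exactly RF nodes hold the row
print(sorted(storage) == sorted(targets))  # True: only replicas stored data
```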

- Tyler

On Thu, Jan 13, 2011 at 12:07 AM, raoyixuan (Shandy)
wrote:

> I mean whether both the coordinate node and the replica node keep the
> insert data. Or just the replica node keep the insert data. And the
> coordinate node just route the insert data to the replica. Can you get me?
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Thursday, January 13, 2011 1:56 PM
> To: user
> Subject: Re: about the insert data
>
> On Wed, Jan 12, 2011 at 5:46 PM, raoyixuan (Shandy)
>  wrote:
> > So you mean the coordinator node is just responsible for routing the
> request.
>
> Right.  Of course, if the coordinator node happens to also be a
> replica, it can be a little more efficient by performing that
> operation directly rather than going over a socket.
>
> > where the request will be Routed? whether the coordinator node route the
> request to the first replica to insert the data?
>
> I don't understand the question.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
I mean: do both the coordinator node and the replica node keep the inserted
data, or does just the replica node keep it, with the coordinator only
routing the insert to the replica? Do you get me?

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Thursday, January 13, 2011 1:56 PM
To: user
Subject: Re: about the insert data

On Wed, Jan 12, 2011 at 5:46 PM, raoyixuan (Shandy)
 wrote:
> So you mean the coordinator node is just responsible for routing the request.

Right.  Of course, if the coordinator node happens to also be a
replica, it can be a little more efficient by performing that
operation directly rather than going over a socket.

> where the request will be Routed? whether the coordinator node route the 
> request to the first replica to insert the data?

I don't understand the question.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: about the insert data

2011-01-12 Thread Jonathan Ellis
On Wed, Jan 12, 2011 at 5:46 PM, raoyixuan (Shandy)
 wrote:
> So you mean the coordinator node is just responsible for routing the request.

Right.  Of course, if the coordinator node happens to also be a
replica, it can be a little more efficient by performing that
operation directly rather than going over a socket.

> where the request will be Routed? whether the coordinator node route the 
> request to the first replica to insert the data?

I don't understand the question.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: about the write consistency

2011-01-12 Thread Brandon Williams
2011/1/12 raoyixuan (Shandy) 

>  if I have 20 nodes, and replica factor is 3, whether all the node have
> the replica finally or just have 3 replica?
>

3.

-Brandon


about the write consistency

2011-01-12 Thread raoyixuan (Shandy)
If I have 20 nodes and the replication factor is 3, will all the nodes
eventually hold a replica, or will there be just 3 replicas?


RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
So you mean the coordinator node is just responsible for routing the request.
Where will the request be routed? Does the coordinator route the request to
the first replica to insert the data?

-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Thursday, January 13, 2011 2:02 AM
To: user@cassandra.apache.org
Subject: Re: about the insert data

> Firstly, the data will be inserted by the coordinator node.
>
> Secondly, it will find the first replica node based on the partitioner, such
> as RandomPartitioner,
>
> Thirdly, it will replicate the data based on the replication factor

Replica placement is entirely independent of which node you talk to.
The node you talk to, the coordinator node, is responsible for routing
the requests appropriately. The replication strategy decides where data
lives.

As a client, you don't have to worry about which node you're talking
to, except for spreading the load out over the nodes in some fashion.

--
/ Peter Schuller


Re: Advice wanted on modeling

2011-01-12 Thread Peter Schuller
> The application will have a large number of records, with the records
> consisting of a fixed part and a number (n) of periodic parts.
> * The fixed part is updated occasionally.
> * The periodic parts are never updated, but a new one is added every 5 to 10
> minutes. Only the last n periodic parts need to be kept, so that the oldest
> one can be deleted after adding a new part.
> * The records will always be read completely (meaning fixed part and all
> periodic parts). Reads are less frequent than writes.
> The application will be running continuously, at least for a few weeks, so
> there will be many, many stale periodic parts, so I'm a bit worried about
> data consumption and compactions.

I was going to hit send on a partial recommendation but realized I
don't really have enough information given that you seem to be making
pretty specific optimizations.

You say writes are more frequent than reads. To what extent - are
reads *very* infrequent to the point that the performance of the reads
are almost completely irrelevant?

You seem worried about tombstones and data size. Is the issue that
you're expecting huge amounts of data and disk space/compaction
frequency is an issue?

Are you expecting write load to be high such that performance of
writes (and compaction) is a concern, or is it mostly about slowly
building up huge amounts of data that you want to be compact on disk?

-- 
/ Peter Schuller


Re: Node Inconsistency

2011-01-12 Thread Peter Schuller
> We will follow your suggestion and we will run Node Repair tool more
> often in the future. However, what happens to data inserted/deleted
> after Node Repair tool runs (i.e., between Node Repair and Major
> Compaction).

It is handled as you would expect; deletions are propagated across the
cluster just like, e.g., an overwrite would be.

The thing that makes tombstones special is that deletion is essentially a
special case. Normal insertions, overwrites or not, are fine because, given
some number of columns, there is never an issue deciding which is the
latest. The *lack* of a column, however, is problematic in a distributed
system, and active removals are therefore represented by these tombstones.
If you were willing to store tombstones forever, they would not be an
issue. But typically that would not make sense, since removed data would
keep having a performance impact on the cluster (and take up some disk
space). Usually, when you remove data you want it actually *removed*, so
that there is no trace of it at all. But as soon as you remove the
tombstone, you lose track of the fact that the data was removed. So unless
you *know* there is no data anywhere in the cluster for that column older
than the tombstone that indicates its removal, the tombstone is not safe to
remove.

So the grace period and the necessity to run nodetool repair are there for
that reason. Periodic nodetool repair is the method by which you can "know"
that there *is* in fact no data anywhere in the cluster for a column older
than the tombstone that indicates its removal. Hence, the expiry of the
tombstones is safe.
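The argument can be compressed into a toy two-replica model (plain dicts with integer timestamps; a `None` value plays the tombstone - this is an illustration, not Cassandra code):

```python
# Toy model: each replica maps column -> (value, timestamp); a None value
# is a tombstone. Repair merges replicas column-by-column, newest wins.
def merge(a, b):
    out = dict(a)
    for col, (val, ts) in b.items():
        if col not in out or ts > out[col][1]:
            out[col] = (val, ts)
    return out

r1 = {"name": ("alice", 100)}
r2 = {"name": ("alice", 100)}

# A delete reaches only replica 1 (say replica 2 was down): tombstone at ts=200.
r1["name"] = (None, 200)

# Case A: repair runs while the tombstone still exists -> deletion propagates.
assert merge(r1, r2)["name"] == (None, 200)

# Case B: the tombstone is purged before replica 2 ever saw it. Replica 1
# now has no trace of the removal, so the stale copy wins on repair.
del r1["name"]
print(merge(r1, r2)["name"])  # ('alice', 100): the deleted column is back
```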

-- 
/ Peter Schuller


Re: Should nodetool ring give equal load ?

2011-01-12 Thread Brandon Williams
On Wed, Jan 12, 2011 at 4:08 PM, mck  wrote:

>
> > You're using an ordered partitioner and your nodes are evenly spread
> > around the ring, but your data probably isn't evenly distributed.
>
> This load number seems equal to `du -hs ` and
> since i've got N == RF shouldn't the data size always be the same on
> every node?
>

Maybe you have damaged replicas, try running repair everywhere.

-Brandon


Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:08 PM, mck  wrote:
>
>> You're using an ordered partitioner and your nodes are evenly spread
>> around the ring, but your data probably isn't evenly distributed.
>
> This load number seems equal to `du -hs ` and
> since i've got N == RF shouldn't the data size always be the same on
> every node?

Good point. I misread that as RF=2. I'm not sure what's going on here,
but it seems wrong. What consistency level did you use to write the
data?

-ryan


Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 23:04 +0100, mck wrote:
> > Caused by: TimedOutException()
> 
> What is the exception in the cassandra logs? 

Or tried increasing rpc_timeout_in_ms?
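For reference, hedged sketches of where that knob lives (paraphrased from memory - check your own config files for the exact names and defaults):

```yaml
# cassandra.yaml (0.7.x) -- request timeout in milliseconds
rpc_timeout_in_ms: 30000
```

On a 0.6.x node, as in this thread, the equivalent should be the `<RpcTimeoutInMillis>` element in storage-conf.xml.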

~mck

-- 
"When there is no enemy within, the enemies outside can't hurt you."
African proverb | www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck

> You're using an ordered partitioner and your nodes are evenly spread
> around the ring, but your data probably isn't evenly distributed. 

This load number seems equal to `du -hs ` and
since i've got N == RF shouldn't the data size always be the same on
every node?

~mck

-- 
"Traveller, there are no paths. Paths are made by walking." Australian
Aboriginal saying | www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote:
> Caused by: TimedOutException()

What is the exception in the cassandra logs?

~mck

-- 
"Don't use Outlook. Outlook is really just a security hole with a small
e-mail client attached to it." Brian Trosko | www.semb.wever.org |
www.sesat.no | www.finn.no | http://xss-http-filter.sf.net




Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Aaron Morton
What's happening in the cassandra server logs when you get these errors? Reading through the Cassandra 0.6.6 Hadoop code, it looks like it creates a thrift client with an infinite timeout. So it may be an internode timeout, which is set in storage-conf.xml.

Aaron

On 13 Jan, 2011, at 07:40 AM, Jairam Chandar wrote:

Hi folks,

We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. I modified the word_count example in the contrib folder of the cassandra distribution. While the program runs fine for small datasets (on the order of 100-200 MB) on small clusters (2 machines), it starts to give errors when run on a bigger cluster (5 machines) with a much larger dataset (400 GB). Here is the error that we get -
java.lang.RuntimeException: TimedOutException()
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
	... 11 more
I came across this page on the Cassandra wiki - http://wiki.apache.org/cassandra/HadoopSupport and tried modifying the ulimit and changing batch sizes. These did not help. Though the number of successful map tasks increased, it eventually fails since the total number of map tasks is huge. 
Any idea on what could be causing this? The program we are running is a very slight modification of the word_count example with respect to reading from Cassandra. The only change being specific keyspace, columnfamily and columns. The rest of the code for reading is the same as the word_count example in the source code for Cassandra 0.6.6.
Thanks and regards,
Jairam Chandar


Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:00 PM, mck  wrote:
> I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner.
>
> When i run "nodetool ring" it reports
>
>> Address         Status State   Load            Owns    Token
>>                                                        Token(bytes[ff034355152567a5b2d962b55990e692])
>> 152.90.242.91   Up     Normal  12.26 GB        33.33%  Token(bytes[01cecd88847283229a3dc88292deff86])
>> 152.90.242.93   Up     Normal  6.13 GB         33.33%  Token(bytes[d4a4de25c0dad34749e99219e227d896])
>> 152.90.242.92   Up     Normal  6.13 GB         33.33%  Token(bytes[ff034355152567a5b2d962b55990e692])
>
> why would the first node have double the Load?
> is this expected or is something wrong?
>
> Number of files data_file_directories for the keyspace is roughly the same.
> But each Index and Filter file is double the size on the first node 
> (regardless of the cf they belong to).
>
> "cleanup" didn't help. "compact" only took away 2GB. Otherwise there is a lot 
> here i don't understand.

You're using an ordered partitioner and your nodes are evenly spread
around the ring, but your data probably isn't evenly distributed.

-ryan
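The point can be illustrated with a toy ring (made-up ranges and keys; with an order-preserving partitioner a row is placed by its raw key, so evenly spaced tokens say nothing about how the data spreads):

```python
import hashlib

# Three nodes with evenly spaced token ranges over the first key byte.
ranges = [(0x00, 0x55), (0x55, 0xAA), (0xAA, 0x100)]

def owner(first_byte):
    for i, (lo, hi) in enumerate(ranges):
        if lo <= first_byte < hi:
            return i

# Order-preserving placement: typical ASCII keys all start with 'u' (0x75),
# so every single row lands on the middle node.
keys = ["user:%d" % i for i in range(300)]
ordered = [0, 0, 0]
for k in keys:
    ordered[owner(k.encode()[0])] += 1
print(ordered)  # [0, 300, 0]

# Hash-based placement (RandomPartitioner-like): the same keys spread out.
hashed = [0, 0, 0]
for k in keys:
    hashed[owner(hashlib.md5(k.encode()).digest()[0])] += 1
print(hashed)   # roughly even across the three nodes
```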


Should nodetool ring give equal load ?

2011-01-12 Thread mck
I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner.

When i run "nodetool ring" it reports

> Address         Status State   Load            Owns    Token
>                                                        Token(bytes[ff034355152567a5b2d962b55990e692])
> 152.90.242.91   Up     Normal  12.26 GB        33.33%  Token(bytes[01cecd88847283229a3dc88292deff86])
> 152.90.242.93   Up     Normal  6.13 GB         33.33%  Token(bytes[d4a4de25c0dad34749e99219e227d896])
> 152.90.242.92   Up     Normal  6.13 GB         33.33%  Token(bytes[ff034355152567a5b2d962b55990e692])

why would the first node have double the Load?
is this expected or is something wrong?

The number of files in data_file_directories for the keyspace is roughly the
same. But each Index and Filter file is double the size on the first node
(regardless of the cf they belong to).

"cleanup" didn't help. "compact" only took away 2GB. Otherwise there is a lot 
here i don't understand.


~mck

-- 
"The turtle only makes progress when it's neck is stuck out" Rollo May |
www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: unsubscribe

2011-01-12 Thread Robert Coli
On Tue, Jan 11, 2011 at 10:29 PM, Nichole Kulobone
 wrote:
>

http://wiki.apache.org/cassandra/FAQ#unsubscribe

=Rob


Re: best way to do a count

2011-01-12 Thread Aaron Morton
There is a get_count() API function (http://wiki.apache.org/cassandra/API); it's going to count the columns in a row or row+super column. This function is available in me.prettyprint.cassandra.service.KeyspaceService.

There are distributed counters submitted to the trunk (http://wiki.apache.org/cassandra/Counters), but these are not in the recent 0.7 release. I lost track of things over the holidays; perhaps someone else knows when these are scheduled to go public.

Aaron

On 13 Jan, 2011, at 09:12 AM, Michael Fortin wrote:

I was working on a schema that looks something like this:

HitFamily [UUID 1] ['user-agent'] = '…'
HitFamily [UUID 1] ['referer'] = '…'
HitFamily [UUID 1] ['client_id'] = Long
…

HitCountFamily [client_id as Long] [Current Date as Long] = UUID1


What I'd like to do is count the columns within a date range without returning them.  Is it possible to get a count of rows in a slice?  Looking at hector and thrift there doesn't seem to be a way to do that.  How have others handled this?

Thanks,



best way to do a count

2011-01-12 Thread Michael Fortin
I was working on a schema that looks something like this:

HitFamily [UUID 1] ['user-agent'] = '…'
HitFamily [UUID 1] ['referer'] = '…'
HitFamily [UUID 1] ['client_id'] = Long
…

HitCountFamily [client_id as Long] [Current Date as Long] = UUID1


What I'd like to do is count the columns within a date range without returning
them.  Is it possible to get a count of rows in a slice?  Looking at hector and
thrift there doesn't seem to be a way to do that.  How have others handled this?

Thanks,
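For what it's worth, the "count without returning" idea is cheap when column names are kept sorted, as they are within a Cassandra row - which is what a server-side get_count can exploit. A standalone sketch of just the counting (toy data, not client code):

```python
import bisect

# Toy row: column names are numeric timestamps kept sorted, as they would
# be under a LongType comparator. Counting a slice is then two binary
# searches -- no column values need to be materialized or returned.
columns = sorted([1294790400, 1294790700, 1294791000, 1294791300, 1294791600])

def count_slice(cols, start, finish):
    """Count columns whose name falls in [start, finish]."""
    lo = bisect.bisect_left(cols, start)
    hi = bisect.bisect_right(cols, finish)
    return hi - lo

print(count_slice(columns, 1294790700, 1294791300))  # 3
```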



Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Jairam Chandar
Hi folks,

We have a Cassandra 0.6.6 cluster running in production. We want to run
Hadoop (version 0.20.2) jobs over this cluster in order to generate
reports.
I modified the word_count example in the contrib folder of the cassandra
distribution. While the program is running fine for small datasets (in the
order of 100-200 MB) on small clusters (2 machines), it starts to give
errors while trying to run on a bigger cluster (5 machines) with much larger
dataset (400 GB). Here is the error that we get -

java.lang.RuntimeException: TimedOutException()
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
at 
org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
at 
org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
at 
org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
at 
org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
... 11 more




I came across this page on the Cassandra wiki -
http://wiki.apache.org/cassandra/HadoopSupport and tried modifying the
ulimit and changing batch sizes. These did not help. Though the number of
successful map tasks increased, it eventually fails since the total number
of map tasks is huge.

Any idea on what could be causing this? The program we are running is a very
slight modification of the word_count example with respect to reading from
Cassandra. The only change being specific keyspace, columnfamily and
columns. The rest of the code for reading is the same as the word_count
example in the source code for Cassandra 0.6.6.

Thanks and regards,
Jairam Chandar


Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Created: https://issues.apache.org/jira/browse/INFRA-3356

On Wed, Jan 12, 2011 at 9:25 AM, zGreenfelder wrote:

> On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev 
> wrote:
> > I'm sending it from my GMail account. I'm opening a new topic, which
> rules
> > out top-posting.
> > The message had mixed fonts in it, that might be a problem.
> > Here's what I'm getting from GMail while sending the message in question:
> > Technical details of permanent failure:
> > Google tried to deliver your message, but it was rejected by the
> recipient
> > domain. We recommend contacting the other email provider for further
> > information about the cause of this error. The error that the other
> server
> > returned was: 552 552 spam score (5.1) exceeded threshold (state 18).
> > And I be damned if I spam. Time to tweak some filters, eh?
>
>
> I didn't mean to suggest top posts were flagged as spam or otherwise
> rejected in principle.   it was more of a 'oh, by the way, you might
> not want to do that' item.
>
>
> --
> Even the Magic 8 ball has an opinion on email clients: Outlook not so good.
>


Re: about the insert data

2011-01-12 Thread Peter Schuller
> Firstly, the data will be inserted by the coordinator node.
>
> Secondly, it will find the first replica node based on the partitioner, such
> as RandomPartitioner,
>
> Thirdly, it will replicate the data based on the replication factor

Replica placement is entirely independent of which node you talk to.
The node you talk to, the coordinator node, is responsible for routing
the requests appropriately. The replication strategy decides where data
lives.

As a client, you don't have to worry about which node you're talking
to, except for spreading the load out over the nodes in some fashion.

--
/ Peter Schuller


Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev  wrote:
> I'm sending it from my GMail account. I'm opening a new topic, which rules
> out top-posting.
> The message had mixed fonts in it, that might be a problem.
> Here's what I'm getting from GMail while sending the message in question:
> Technical details of permanent failure:
> Google tried to deliver your message, but it was rejected by the recipient
> domain. We recommend contacting the other email provider for further
> information about the cause of this error. The error that the other server
> returned was: 552 552 spam score (5.1) exceeded threshold (state 18).
> And I be damned if I spam. Time to tweak some filters, eh?


I didn't mean to suggest top posts were flagged as spam or otherwise
rejected in principle.   it was more of a 'oh, by the way, you might
not want to do that' item.


-- 
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.


Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 09:09 -0800, Oleg Tsvinev wrote:
> Which component? Mail Archives or Mail (qmail)?

Mail would be my guess.

-- 
Eric Evans
eev...@rackspace.com



Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Which component? Mail Archives or Mail (qmail)?

On Wed, Jan 12, 2011 at 9:06 AM, Eric Evans  wrote:

> On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote:
> >
> > And I be damned if I spam. Time to tweak some filters, eh?
>
> Maybe so.  We don't have any control over that though I'm afraid.  Can
> you submit a ticket to INFRA?
>
> https://issues.apache.org/jira/browse/INFRA
>
> > On Wed, Jan 12, 2011 at 8:17 AM, Eric Evans 
> > wrote:
> >
> > > On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote:
> > > > What's wrong with topposting?
> > > >
> > > > This email is non-plain and topposted...
> > >
> > > Because a little piece of me dies every time you do.
> --
> Eric Evans
> eev...@rackspace.com
>
>


Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote:
> 
> And I be damned if I spam. Time to tweak some filters, eh?

Maybe so.  We don't have any control over that though I'm afraid.  Can
you submit a ticket to INFRA?

https://issues.apache.org/jira/browse/INFRA

> On Wed, Jan 12, 2011 at 8:17 AM, Eric Evans 
> wrote:
> 
> > On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote:
> > > What's wrong with topposting?
> > >
> > > This email is non-plain and topposted...
> >
> > Because a little piece of me dies every time you do. 
-- 
Eric Evans
eev...@rackspace.com



Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
I'm sending it from my GMail account. I'm opening a new topic, which rules
out top-posting.
The message had mixed fonts in it, that might be a problem.
Here's what I'm getting from GMail while sending the message in question:

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the recipient
domain. We recommend contacting the other email provider for further
information about the cause of this error. The error that the other server
returned was: 552 552 spam score (5.1) exceeded threshold (state 18).

And I be damned if I spam. Time to tweak some filters, eh?

On Wed, Jan 12, 2011 at 8:17 AM, Eric Evans  wrote:

> On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote:
> > What's wrong with topposting?
> >
> > This email is non-plain and topposted...
>
> Because a little piece of me dies every time you do.
>
> --
> Eric Evans
> eev...@rackspace.com
>
>


Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote:
> What's wrong with topposting?
> 
> This email is non-plain and topposted... 

Because a little piece of me dies every time you do.

-- 
Eric Evans
eev...@rackspace.com



Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 9:46 AM, David Boxenhorn  wrote:
> What's wrong with topposting?
>
> This email is non-plain and topposted...
>

I suspect your origin domain (lookin2.com) gets tagged less often by
spam assassin (or whatever the moral equivalent being used for this
list may be) and the limits set on your mail content are more lax than
they would be from a gmail.com account.

it's all speculation on my part, and I could be completely off.   but
as I said before, it seems to be the way to get through if being
blocked on lists.   or it always has been for me.


-- 
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.


Usage Pattern : "unique" value of a key.

2011-01-12 Thread Benoit Perroud
Hi ML,

I wonder if someone has already experimented with some kind of unique
index on a column family key.

Let's go with a short example: the key is the username. What happens
if 2 users want to sign up at the same time with the same username?

So, has someone already addressed this "pattern" from a Cassandra point of view?

My idea to solve such a use case is to have both threads write the
username, but with a column like "lock-<random value>", and then read
the row and find out whether the first lock column appearing belongs to the
thread. If this is the case, it can continue the process; otherwise it
has been preempted by another thread.

Does anyone have another idea to share?

Thanks in advance,

Kind regards,

Benoit.
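
The lock-column idea above can be sketched as follows (plain Python, with an in-memory dict standing in for the Cassandra row and a barrier standing in for "everyone's write is visible"; in a real cluster these would be insert/get_slice calls at a suitable consistency level, and the column names are assumptions):

```python
import threading
import uuid

row = {}                        # stand-in for the row keyed by the username
row_guard = threading.Lock()    # protects only the fake dict, not the pattern
barrier = threading.Barrier(2)  # simulates waiting until all writes are visible
results = []

def try_register():
    # Write a "lock-<random value>" column, as described above.
    my_lock = "lock-" + uuid.uuid4().hex
    with row_guard:
        row[my_lock] = "desired-username"
    barrier.wait()  # both contenders have written before anyone reads
    # Read the row back; the comparator orders columns the same way for
    # everyone, so all contenders agree on which lock column comes first.
    with row_guard:
        first = min(c for c in row if c.startswith("lock-"))
    results.append(first == my_lock)

threads = [threading.Thread(target=try_register) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Exactly one thread believes it owns the username.
assert results.count(True) == 1
```

Note that the barrier is doing real work here: if a contender reads back before the other's write lands, both can conclude they won, which is the window a real deployment has to close (e.g. by reading at QUORUM after a QUORUM write, or waiting out the race).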


Re: Why my posts are marked as spam?

2011-01-12 Thread Sven Johansson
On Wed, Jan 12, 2011 at 3:46 PM, David Boxenhorn  wrote:

> What's wrong with topposting?
>
>
"A: Because it's counterintuitive to the way we read.
Q: Why is top-posting bad?"

...and because it disregards context and makes a thread harder to follow.

-- 
Sven Johansson
Twitter: @svjson


Re: Why my posts are marked as spam?

2011-01-12 Thread David Boxenhorn
What's wrong with topposting?

This email is non-plain and topposted...

On Wed, Jan 12, 2011 at 4:32 PM, zGreenfelder wrote:

> >
> > On 12 January 2011 05:28, Oleg Tsvinev  wrote:
> > > Whatever I do, it happens :(
> >On Wed, Jan 12, 2011 at 1:53 AM, Arijit Mukherjee 
> wrote:
> >
> > I think this happens for RTF. Some of the mails in the post are RTF,
> > and the reply button creates an RTF reply - that's when it happens.
> > Wonder how the mail to which I replied was in RTF...
> >
> > Arijit
> >
> >
> > --
> > "And when the night is cloudy,
> > There is still a light that shines on me,
> > Shine on until tomorrow, let it be."
>
> I think it happens for any non-plain text.. be it RTF, HTML, or
> whatever.   at least that's been my limited experience with mailing
> lists.
>
> and for what it's worth (I just had to correct myself, so don't take
> this as huge criticism), many people are also opposed to topposting ..
> or adding a reply to the top of an email.   FWIW.
>
> --
> Even the Magic 8 ball has an opinion on email clients: Outlook not so good.
>


Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
>
> On 12 January 2011 05:28, Oleg Tsvinev  wrote:
> > Whatever I do, it happens :(
>On Wed, Jan 12, 2011 at 1:53 AM, Arijit Mukherjee  wrote:
>
> I think this happens for RTF. Some of the mails in the post are RTF,
> and the reply button creates an RTF reply - that's when it happens.
> Wonder how the mail to which I replied was in RTF...
>
> Arijit
>
>
> --
> "And when the night is cloudy,
> There is still a light that shines on me,
> Shine on until tomorrow, let it be."

I think it happens for any non-plain text.. be it RTF, HTML, or
whatever.   at least that's been my limited experience with mailing
lists.

and for what it's worth (I just had to correct myself, so don't take
this as huge criticism), many people are also opposed to topposting ..
or adding a reply to the top of an email.   FWIW.

--
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.


Re: Reclaim deleted rows space

2011-01-12 Thread David Boxenhorn
I think that if SSTs are partitioned within the node using RP, so that each
partition is small and can be compacted independently of all other
partitions, you can implement an algorithm that will spread out the work of
compaction over time so that it never takes a node out of commission, as it
does now.

I have left a comment to that effect here:

https://issues.apache.org/jira/browse/CASSANDRA-1608?focusedCommentId=12980654&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12980654

On Mon, Jan 10, 2011 at 10:56 PM, Jonathan Ellis  wrote:

> I'd suggest describing your approach on
> https://issues.apache.org/jira/browse/CASSANDRA-1608, and if it's
> attractive, porting it to 0.8.  It's too late for us to make deep
> changes in 0.6 and probably even 0.7 for the sake of stability.
>
> On Mon, Jan 10, 2011 at 8:00 AM, shimi  wrote:
> > I modified the code to limit the size of the SSTables.
> > I will be glad if someone can take a look at it
> > https://github.com/Shimi/cassandra/tree/cassandra-0.6
> > Shimi
> >
> > On Fri, Jan 7, 2011 at 2:04 AM, Jonathan Shook  wrote:
> >>
> >> I believe the following condition within submitMinorIfNeeded(...)
> >> determines whether to continue, so it's not a hard loop.
> >>
> >> // if (sstables.size() >= minThreshold) ...
> >>
> >>
> >>
> >> On Thu, Jan 6, 2011 at 2:51 AM, shimi  wrote:
> >> > According to the code it makes sense.
> >> > submitMinorIfNeeded() calls doCompaction() which
> >> > calls submitMinorIfNeeded().
> >> > With minimumCompactionThreshold = 1 submitMinorIfNeeded() will always
> >> > run
> >> > compaction.
> >> >
> >> > Shimi
> >> > On Thu, Jan 6, 2011 at 10:26 AM, shimi  wrote:
> >> >>
> >> >>
> >> >> On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis 
> >> >> wrote:
> >> >>>
> >> >>> Pretty sure there's logic in there that says "don't bother
> compacting
> >> >>> a single sstable."
> >> >>
> >> >> No. You can do it.
> >> >> Based on the log I have a feeling that it triggers an infinite
> >> >> compaction
> >> >> loop.
> >> >>
> >> >>>
> >> >>> On Wed, Jan 5, 2011 at 2:26 PM, shimi  wrote:
> >> >>> > How does minor compaction is triggered? Is it triggered Only when
> a
> >> >>> > new
> >> >>> > SStable is added?
> >> >>> >
> >> >>> > I was wondering if triggering a compaction
> >> >>> > with minimumCompactionThreshold
> >> >>> > set to 1 would be useful. If this can happen I assume it will do
> >> >>> > compaction
> >> >>> > on files with similar size and remove deleted rows on the rest.
> >> >>> > Shimi
> >> >>> > On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
> >> >>> > 
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> > I don't have a problem with disk space. I have a problem with
> the
> >> >>> >> > data
> >> >>> >> > size.
> >> >>> >>
> >> >>> >> [snip]
> >> >>> >>
> >> >>> >> > Bottom line is that I want to reduce the number of requests
> that
> >> >>> >> > goes to
> >> >>> >> > disk. Since there is enough data that is no longer valid I can
> do
> >> >>> >> > it
> >> >>> >> > by
> >> >>> >> > reclaiming the space. The only way to do it is by running Major
> >> >>> >> > compaction.
> >> >>> >> > I can wait and let Cassandra do it for me but then the data
> size
> >> >>> >> > will
> >> >>> >> > get
> >> >>> >> > even bigger and the response time will be worse. I can do it
> >> >>> >> > manually
> >> >>> >> > but I
> >> >>> >> > prefer it to happen in the background with less impact on the
> >> >>> >> > system
> >> >>> >>
> >> >>> >> Ok - that makes perfect sense then. Sorry for misunderstanding :)
> >> >>> >>
> >> >>> >> So essentially, for workloads that are teetering on the edge of
> >> >>> >> cache
> >> >>> >> warmness and is subject to significant overwrites or removals, it
> >> >>> >> may
> >> >>> >> be beneficial to perform much more aggressive background
> compaction
> >> >>> >> even though it might waste lots of CPU, to keep the in-memory
> >> >>> >> working
> >> >>> >> set down.
> >> >>> >>
> >> >>> >> There was talk (I think in the compaction redesign ticket) about
> >> >>> >> potentially improving the use of bloom filters such that obsolete
> >> >>> >> data
> >> >>> >> in sstables could be eliminated from the read set without
> >> >>> >> necessitating actual compaction; that might help address cases
> like
> >> >>> >> these too.
> >> >>> >>
> >> >>> >> I don't think there's a pre-existing silver bullet in a current
> >> >>> >> release; you probably have to live with the need for
> >> >>> >> greater-than-theoretically-optimal memory requirements to keep
> the
> >> >>> >> working set in memory.
> >> >>> >>
> >> >>> >> --
> >> >>> >> / Peter Schuller
> >> >>> >
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Jonathan Ellis
> >> >>> Project Chair, Apache Cassandra
> >> >>> co-founder of Riptano, the source for professional Cassandra support
> >> >>> http://riptano.com
> >> >>
> >> >
> >> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
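
The termination condition discussed in this thread can be illustrated with a small sketch (plain Python, hypothetical numbers, not the actual Cassandra code): as long as minThreshold is greater than 1, each compaction round consumes more sstables than it produces, so the submitMinorIfNeeded -> doCompaction -> submitMinorIfNeeded cycle runs down; with minThreshold = 1 each round replaces one sstable with one sstable, the count never drops below the threshold, and the loop never ends — matching the behaviour observed above.

```python
def run_minor_compactions(sstables, min_threshold):
    """Merge one bucket of min_threshold sstables per round until fewer than
    min_threshold remain. Returns (remaining sstables, rounds run)."""
    if min_threshold < 2:
        # With min_threshold = 1 every single sstable qualifies forever.
        raise ValueError("min_threshold = 1 would never terminate")
    rounds = 0
    while len(sstables) >= min_threshold:
        merged = "merged-%d" % rounds
        # Each round removes min_threshold sstables and adds one merged result,
        # so the count strictly decreases and the loop terminates.
        sstables = sstables[min_threshold:] + [merged]
        rounds += 1
    return sstables, rounds

remaining, rounds = run_minor_compactions(["sst-%d" % i for i in range(10)], 4)
# 10 -> 7 -> 4 -> 1: three rounds, one sstable left
```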

Re: how to do a get_range_slices where all keys start with same string

2011-01-12 Thread Stephen Connolly
or set the end key to "com.googlf"

On 12 January 2011 02:49, Aaron Morton  wrote:

> If you were using OPP and get_range_slices then set the start_key to be
> "com.google" and the end_key to be "". Get slices of say 1,000 (use the
> last key read as the next start_key) and when you see the first key that
> does not start with com.google, stop making calls.
>
> If you move the data from rows to columns, you can use the same approach.
>
> Aaron
>
>
> On 12 Jan, 2011,at 03:25 PM, Roshan Dawrani 
> wrote:
>
> On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers <
> koert.kuip...@diamondnotch.com> wrote:
>
>> Ok I see get_range_slice is really only useful for paging with RP..
>>
>> So if I were using OPP (which I am not) and I wanted all keys starting
>> with "com.google", what should my start_key and end_key be?
>>
>
> I think you can't. It's the columns that are sorted, and not the rows (if
> you are not using OPP). With your "com.google." data arranged in columns
> instead of rows, you should be able to specify start_col, end_col to filter
> it.
>
>
>
>
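
The "com.googlf" trick in the first reply generalises: under an order-preserving partitioner, an exclusive upper bound for a prefix scan is the prefix with its last character bumped by one. A minimal sketch (plain Python, hypothetical row keys):

```python
def prefix_range(prefix):
    """Start/end keys covering every key that begins with `prefix` under an
    order-preserving partitioner: bump the prefix's last character, e.g.
    "com.google" -> ("com.google", "com.googlf"). Sketch only -- assumes the
    last character is not already at the top of the comparator's order."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

start, end = prefix_range("com.google")
# Hypothetical row keys, sorted into comparator order.
keys = sorted(["com.apple/1", "com.google/a", "com.google/z",
               "com.googly", "org.example"])
matched = [k for k in keys if start <= k < end]
# -> ["com.google/a", "com.google/z"]
```

This gives an exact half-open range [prefix, bumped-prefix), so the paging loop described above can stop as soon as get_range_slices returns a key at or beyond the end key.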