Re: About Tombstones and TTLs

2016-12-19 Thread Cody Yancey
>> Cassandra stores hints for the lowest of gc_grace_seconds and max_hint_window_in_ms

Was this a tough design decision or just a bug? It is certainly very
surprising behavior. Everything that I've read leads me to believe that
gc_grace_seconds was only intended to affect the treatment of *expired*
 data.

Thanks,
Cody

On Mon, Dec 19, 2016 at 8:10 AM, Alain RODRIGUEZ  wrote:

> Hi,
>
>
>>- Why does setting gc_grace_seconds=0 disable hints for the table?
>>
> It was the first time I heard about this as well, when Alexander told us
> about it. This read might be helpful:
> http://www.uberobert.com/cassandra_gc_grace_disables_hinted_handoff/.
> Also, I know Alexander tested it.
>
> *tl;dr*:  Cassandra stores hints for the lowest of gc_grace_seconds and
> max_hint_window_in_ms
>
> Still, I see no reason not to set gc_grace_seconds to 3 hours as a fix /
> workaround. Keeping 3 hours of extra data on disk is something you should
> definitely be able to afford.
>
>
>>- How can an expired TTL record be deleted by Cassandra without
>>tombstoning or compaction? Aren't SSTables immutable files, with expired
>>records removed through compaction?
>>
>>
> This sounds magical to me as well. The only way I am aware of to drop
> tombstones without compaction is having an entirely expired SSTable that
> would soon be evicted, without compactions. TWCS relies on this property
> and makes great use of it. Here is Jeff's talk about TWCS:
> https://www.youtube.com/watch?v=PWtekUWCIaw. I believe he mentions it there.
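For illustration, a minimal CQL sketch (table name and options are hypothetical, not from the thread) of a TWCS table where every row carries a TTL, so whole SSTables can eventually expire:

CREATE TABLE my_keyspace.sensor_readings (
    sensor_id  text,
    event_time timestamp,
    value      double,
    PRIMARY KEY (sensor_id, event_time)
) WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                     'compaction_window_unit': 'HOURS',
                     'compaction_window_size': 1}
  AND default_time_to_live = 86400;   -- every row expires after one day

The point Alain makes is that once everything in one of those time-window SSTables has expired, the whole file can be dropped without being rewritten by compaction.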
>
>
>>- If I only use TTL for deletion, do I still need gc_grace_seconds to
>>be bigger than 0?
>>
>>
>>- If I only use TTL for deletion, but use updates as well, do I need
>>gc_grace_seconds to be bigger than 0?
>>
>>
> Yes, if you care about hints. Anyway, setting gc_grace_seconds to 0 brings
> more trouble than it solves in many cases. Use the value of
> max_hint_window_in_ms as the minimum gc_grace_seconds (watch out for the time
> units in use, do the math ;-) )
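As a concrete sketch (hypothetical table name) — the unit mismatch is the "math" above: max_hint_window_in_ms defaults to 10800000 ms, i.e. 10800 seconds:

ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 10800;  -- 3 hours, matching the default max_hint_window_in_ms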
>
> Here is a blog post I wrote a few months ago about tombstones and deletes:
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html.
> I hope it will give you useful insight into tombstones, even if you
> do not care about all the "deletes" part. About TTLs, see
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html#tombstones-drop.
> There is no need for you to repair within gc_grace_seconds, but given that
> "Cassandra stores hints for the lowest of gc_grace_seconds and
> max_hint_window_in_ms" I would never use a value lower than 3 hours (the
> default max_hint_window_in_ms) for gc_grace_seconds, on any table.
>
> C*heers,
>
>
> 2016-12-19 15:07 GMT+01:00 Shalom Sagges :
>
>> Thanks for the explanation, Matija, but fortunately that part I already
>> know. I forgot to mention that I'm using a multi-DC cluster.
>> I'll try to summarize just the questions I have, because my email was
>> indeed quite long :-)
>>
>>
>>- Why does setting gc_grace_seconds=0 disable hints for the table?
>>- How can an expired TTL record be deleted by Cassandra without
>>tombstoning or compaction? Aren't SSTables immutable files, with expired
>>records removed through compaction?
>>- If I only use TTL for deletion, do I still need gc_grace_seconds to
>>be bigger than 0?
>>- If I only use TTL for deletion, but use updates as well, do I need
>>gc_grace_seconds to be bigger than 0?
>>
>>
>> Thanks!
>>
>>
>> Shalom Sagges
>> DBA
>>
>>
>>
>> On Mon, Dec 19, 2016 at 2:39 PM, Matija Gobec 
>> wrote:
>>
>>> Hi,
>>>
>>> gc_grace_seconds is used to maintain data consistency in some failure
>>> scenarios. When you manually delete data, that action creates tombstones,
>>> which are kept for that defined period before being compacted away. If one
>>> of the replica nodes is down while the data is deleted and it comes back up
>>> after the gc_grace_seconds period, your previously deleted data will
>>> reappear (ghost data). As stated in the DataStax documentation, on a single
>>> node you can set gc_grace_seconds to 0, and you can do the same for tables
>>> that contain only data with a TTL. In the mentioned failure scenario your
>>> downed node will still have the data with its TTL information, so no data
>>> inconsistency will happen.
>>>
>>> On Mon, Dec 19, 2016 at 1:00 PM, Shalom Sagges 
>>> wrote:
>>>
 Hi Everyone,

 I was reading a blog on TWCS by Alex Dejanovski from The Last Pickle (
 http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html)

 When I got to the comments section, I didn't understand why setting
 gc_grace_seconds to 0 will disable hints for the associated table:
 *"It is a very good point that gc_grace_seconds shouldn't be lowered
 too much as its impact on hinted hand

Re: Batch size warnings

2016-12-07 Thread Cody Yancey
There is a disconnect between write.3 and write.4, but it can only affect
performance, not consistency. The presence or absence of a row's txnUUID in
the IncompleteTransactions table is the ultimate source of truth, and rows
whose txnUUID are not null will be checked against that truth in the read
path.

And yes, it is a good point, failures with this model will accumulate and
degrade performance if you never clear out old failed transactions. The
tables we have that use this generally use TTLs so we don't really care as
long as irrecoverable transaction failures are very rare.

Thanks,
Cody

On Wed, Dec 7, 2016 at 1:56 PM, Voytek Jarnot 
wrote:

> Appreciate the long writeup Cody.
>
> Yeah, we're good with temporary inconsistency (thankfully) as well.  I'm
> going to try to ride the batch train and hope it doesn't derail - our load
> is fairly static (or, more precisely, increase in load is fairly slow and
> can be projected).
>
> Enjoyed your two-phase commit text.  Presumably one would also have some
> cleanup implementation that culls any failed updates (write.5) which could
> be identified in read.3 / read.4?  Still, a disconnect is possible between
> write.3 and write.4, but there's always something...
>
> We're insert-only (well, with some deletes via TTL, but anyway), so that's
> somewhat tempting, but I'd rather not prematurely optimize.  Unless, of
> course, anyone's got experience such that "batches over XXkb are definitely
> going to be a problem".
>
> Appreciate everyone's time.
> --Voytek Jarnot
>
> On Wed, Dec 7, 2016 at 11:31 AM, Cody Yancey  wrote:
>
>> Hi Voytek,
>> I think the way you are using it is definitely the canonical way.
>> Unfortunately, as you learned, there are some gotchas. We tried
>> substantially increasing the batch size and it worked for a while, until we
>> reached new scale, and we increased it again, and so forth. It works, but
>> soon you start getting write timeouts, lots of them. And the thing about
>> multi-partition batch statements is that they offer atomicity, but not
>> isolation. This means your database can temporarily be in an inconsistent
>> state while writes are propagating to the various machines.
>>
>> For our use case, we could deal with temporary inconsistency, as long as
>> it was for a strictly bounded period of time, on the order of a few
>> seconds. Unfortunately, as with all things eventually consistent, it
>> degrades to "totally inconsistent" when your database is under heavy load
>> and the time-bounds expand beyond what the application can handle. When a
>> batch write times out, it often still succeeds (eventually) but your tables
>> can be inconsistent for minutes, even while nodetool status shows all nodes
>> up and normal.
>>
>> But there is another way, that requires us to take a page from our RDBMS
>> ancestors' book: multi-phase commit.
>>
>> Similar to logged batch writes, multi-phase commit patterns typically
>> entail some write amplification cost for the benefit of stronger
>> consistency guarantees across isolatable units (in Cassandra's case,
>> *partitions*). However, multi-phase commit offers stronger guarantees
>> than batch writes, and ALL of the additional write load is completely
>> distributed as per your load-balancing policy, whereas batch writes all go
>> through one coordinator node, then get written in their entirety to the
>> batch log on two or three nodes, and then get dispersed in a distributed
>> fashion from there.
>>
>> A typical two-phase commit pattern looks like this:
>>
>> The Write Path
>>
>>1. The client code chooses a random UUID.
>>2. The client writes the UUID into the IncompleteTransactions table,
>>which only has one column, the transactionUUID.
>>3. The client makes all of the inserts involved in the transaction,
>>IN PARALLEL, with the transactionUUID duplicated in every inserted row.
>>4. The client deletes the UUID from IncompleteTransactions table.
>>5. The client makes parallel updates to all of the rows it inserted,
>>IN PARALLEL, setting the transactionUUID to null.
>>
>> The Read Path
>>
>>1. The client reads some rows from a partition. If this particular
>>client request can handle extraneous rows, you are done. If not, read on
>>to step #2.
>>2. The client gathers the set of unique transactionUUIDs. In the main
>>case, they've all been deleted by step #5 in the Write Path. If not, go to
>>#3.
>>3. For remaining transactionUUIDs (whic

Re: Batch size warnings

2016-12-07 Thread Cody Yancey
Hi Voytek,
I think the way you are using it is definitely the canonical way.
Unfortunately, as you learned, there are some gotchas. We tried
substantially increasing the batch size and it worked for a while, until we
reached new scale, and we increased it again, and so forth. It works, but
soon you start getting write timeouts, lots of them. And the thing about
multi-partition batch statements is that they offer atomicity, but not
isolation. This means your database can temporarily be in an inconsistent
state while writes are propagating to the various machines.

For our use case, we could deal with temporary inconsistency, as long as it
was for a strictly bounded period of time, on the order of a few seconds.
Unfortunately, as with all things eventually consistent, it degrades to
"totally inconsistent" when your database is under heavy load and the
time-bounds expand beyond what the application can handle. When a batch
write times out, it often still succeeds (eventually) but your tables can
be inconsistent for minutes, even while nodetool status shows all nodes up
and normal.

But there is another way, that requires us to take a page from our RDBMS
ancestors' book: multi-phase commit.

Similar to logged batch writes, multi-phase commit patterns typically
entail some write amplification cost for the benefit of stronger
consistency guarantees across isolatable units (in Cassandra's case,
*partitions*). However, multi-phase commit offers stronger guarantees than
batch writes, and ALL of the additional write load is completely
distributed as per your load-balancing policy, whereas batch writes all go
through one coordinator node, then get written in their entirety to the
batch log on two or three nodes, and then get dispersed in a distributed
fashion from there.

A typical two-phase commit pattern looks like this:

The Write Path

   1. The client code chooses a random UUID.
   2. The client writes the UUID into the IncompleteTransactions table,
   which only has one column, the transactionUUID.
   3. The client makes all of the inserts involved in the transaction, IN
   PARALLEL, with the transactionUUID duplicated in every inserted row.
   4. The client deletes the UUID from IncompleteTransactions table.
   5. The client makes parallel updates to all of the rows it inserted, IN
   PARALLEL, setting the transactionUUID to null.

The Read Path

   1. The client reads some rows from a partition. If this particular
   client request can handle extraneous rows, you are done. If not, read on to
   step #2.
   2. The client gathers the set of unique transactionUUIDs. In the main
   case, they've all been deleted by step #5 in the Write Path. If not, go to
   #3.
   3. For remaining transactionUUIDs (which should be a very small number),
   query the IncompleteTransactions table.
   4. The client code culls rows where the transactionUUID existed in the
   IncompleteTransactions table.

This is just an example, one that is reasonably performant for ledger-style
non-updated inserts. For transactions involving updates to possibly
existing data, more effort is required: generally the client needs to be
smart enough to merge updates based on a timestamp, with a periodic batch
job that cleans out obsolete inserts. If it feels like reinventing the
wheel, that's because it is. But it just might be the quickest path to what
you need.
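
A minimal CQL sketch of the pattern described above — keyspace, table, and column names are hypothetical, the statements are meant to be prepared and bound by the client, and the ledger table stands in for whatever tables the transaction touches:

CREATE TABLE txn.incomplete_transactions (
    txn_uuid uuid PRIMARY KEY
);

CREATE TABLE txn.ledger (
    account    text,
    entry_id   timeuuid,
    amount     decimal,
    txn_uuid   uuid,        -- non-null only while the transaction may be incomplete
    PRIMARY KEY (account, entry_id)
);

-- Write path (step 1, choosing the random UUID, happens purely client-side):
INSERT INTO txn.incomplete_transactions (txn_uuid) VALUES (?);                     -- step 2
INSERT INTO txn.ledger (account, entry_id, amount, txn_uuid) VALUES (?, ?, ?, ?);  -- step 3, per row, in parallel
DELETE FROM txn.incomplete_transactions WHERE txn_uuid = ?;                        -- step 4
UPDATE txn.ledger SET txn_uuid = null WHERE account = ? AND entry_id = ?;          -- step 5, per row, in parallel

-- Read path, step 3: for any txn_uuid still attached to a row, check whether the
-- transaction is still listed as incomplete; if it is, cull those rows client-side.
SELECT txn_uuid FROM txn.incomplete_transactions WHERE txn_uuid = ?;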

Thanks,
Cody

On Wed, Dec 7, 2016 at 10:15 AM, Edward Capriolo 
wrote:

> I have been circling around a thought process over batches. Now that
> Cassandra has aggregating functions, it might be possible to write a type of
> record that has an END_OF_BATCH type marker so the data can be suppressed
> from view until it is all there.
>
> I.e., you write something like a checksum record that an intelligent client
> can use to tell if the rest of the batch is complete.
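A rough sketch of that idea — the table, column names, and marker convention below are hypothetical: the writer appends its data rows plus one marker row recording how many rows to expect, and the client only treats the batch as complete once the marker is present and the count matches.

CREATE TABLE ks.events (
    batch_id text,
    seq      int,
    payload  text,
    PRIMARY KEY (batch_id, seq)
);

-- data rows, written however is convenient (individually or in smaller batches):
INSERT INTO ks.events (batch_id, seq, payload) VALUES ('batch-042', 1, 'first row');
INSERT INTO ks.events (batch_id, seq, payload) VALUES ('batch-042', 2, 'second row');
-- final "checksum" / END_OF_BATCH row, written last; seq 0 is reserved for it and
-- the payload records how many data rows a reader should expect:
INSERT INTO ks.events (batch_id, seq, payload) VALUES ('batch-042', 0, 'END_OF_BATCH:2');

-- a reader only exposes batch-042 once the seq 0 row exists and the partition
-- contains the advertised number of data rows:
SELECT seq, payload FROM ks.events WHERE batch_id = 'batch-042';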
>
> On Wed, Dec 7, 2016 at 11:58 AM, Voytek Jarnot 
> wrote:
>
>> Been about a month since I gave up on it, but it was very much related to
>> the stuff you're dealing with ... Basically Cassandra just stepping on its
>> own, er, tripping over its own feet streaming MVs.
>>
>> On Dec 7, 2016 10:45 AM, "Benjamin Roth"  wrote:
>>
>>> I meant the mv thing
>>>
>>> Am 07.12.2016 17:27 schrieb "Voytek Jarnot" :
>>>
 Sure, about which part?

 default batch size warning is 5kb
 I've increased it to 30kb, and will need to increase to 40kb (8x
 default setting) to avoid WARN log messages about batch sizes.  I do
 realize it's just a WARNing, but may as well avoid those if I can configure
 it out.  That said, having to increase it so substantially (and we're only
 dealing with 5 tables) is making me wonder if I'm not taking the correct
 approach in terms of using batches to guarantee atomicity.

 On Wed, Dec 7, 2016 at 10:13 AM, Benjamin Roth >>> > wrote:

> Could you please be more specific?
>
> Am 07.12.2016 17:10 schrieb "Voytek Jarnot" :
>
>> Should've mentioned - run

Re: Why does `now()` produce different times within the same query?

2016-12-01 Thread Cody Yancey
On Thu, Dec 1, 2016 at 11:09 AM Sylvain Lebresne 
wrote:

> there is a much, much more trivial solution: generate it client side. The
> `now()` function is a small convenience but there is nothing you cannot do
> without it client side
>

Please see my post above as to why this is a bad idea for inserts based on
request time where knowing the time the request was made is actually
important.

Cody

>


Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Cody Yancey
This is not a bug, and in fact changing it would be a serious bug.

False. Absolutely no consumer would be broken by a change to guarantee an
identical time component that isn't broken already, for the simple reason
your code already has to handle that case, as it is in fact the majority
case RIGHT NOW. Users can hit this bug, in production, because unit tests
might not have experienced it! The time component should be the time that the
command was processed by the coordinator node.

 would one expect a java/py/bash script that loops

Individual Cassandra writes (which is what OP is referring to specifically)
are not loops. They are in almost every case atomic operations that either
succeed completely or fail completely. Allowing a single atomic operation
to witness multiple times in these corner cases is not only surprising, as
this thread demonstrates, it also needlessly restricts what
developers can use the database for, and provides NO BENEFIT.

Calling now PRIOR to initiating multiple inserts is in most cases
exactly what one does...the ONLY practice is to set the value before
initiating the sequence of calls

Also false. Cassandra does not have a way of doing this on the coordinator
node rather than the client device, and as I already showed, the client
device is the wrong place to do it in situations where guaranteeing bounded
clock-skew actually makes a difference one way or the other.

Thanks,
Cody



On Wed, Nov 30, 2016 at 8:02 PM, daemeon reiydelle 
wrote:

> This is not a bug, and in fact changing it would be a serious bug.
>
> What it is is a wonderful case of bad coding: would one expect a
> java/py/bash script that loops on a bunch of read/execute/update calls where
> each iteration calls time to return the same exact time for the duration of
> the execution of the code? Whether the code runs for 5 seconds or 5 hours?
>
> Every call to a system call is unique, including within C*. Calling now
> PRIOR to initiating multiple inserts is in most cases exactly what one does
> to assure unique time stamps FOR THE BATCH OF INSERTS. To get a system time
> nearly identical to what becomes the uuid of the row, one tries to call
> time as close to just before the insert as possible. Then repeat.
>
> You have a logic issue in your code. If you want the same value for a set
> of calls, the ONLY practice is to set the value before initiating the
> sequence of calls.
>
>
>
>
> On Wed, Nov 30, 2016 at 6:16 PM, Cody Yancey  wrote:
>
>> Getting the same TimeUUID values might be a major problem. Getting two
>> different TimeUUIDs that at least have the same time component would not be a major
>> problem as this is the main case today. Getting different time components
>> is actually the corner case, and it is a corner case that breaks
>> Internet-of-Things applications. We can tightly control clock skew in our
>> cluster. We most definitely CANNOT control clock skew on the thousands of
>> sensors that write to our cluster.
>>
>> Thanks,
>> Cody
>>
>> On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:
>>
>>> In my opinion, this is not broken and “fixing” it would break existing
>>> code. Consider a batch that includes multiple inserts, each of which
>>> inserts the value returned by now(). Getting the same UUID for each insert
>>> would be a major problem.
>>>
>>> Cheers
>>>
>>> Robert
>>>
>>>
>>> On Nov 30, 2016, at 4:46 PM, Todd Fast 
>>> wrote:
>>>
>>> FWIW I'd suggest opening a bug--this behavior is certainly quite
>>> unexpected and more than just a documentation issue. In general I can't
>>> imagine any desirable properties of the current implementation, and there
>>> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>>>
>>> Todd
>>>
>>> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>>>
>>>> Sorry for my typo. Obviously, I meant:
>>>> "It appears that a single query that calls Cassandra's`now()` time
>>>> function *multiple times *may actually cause a query to write or
>>>> return different times."
>>>>
>>>> Less of a surprise now that I realize more about the implementation,
>>>> but I agree that more explicit documentation around when exactly the
>>>> "execution" of each now() statement happens and what implications it has
>>>> for the resulting timestamps would be helpful when running into this.
>>>>
>

Re: Why does `now()` produce different times within the same query?

2016-11-30 Thread Cody Yancey
Getting the same TimeUUID values might be a major problem. Getting two
different TimeUUIDs that at least have the same time component would not be a major
problem as this is the main case today. Getting different time components
is actually the corner case, and it is a corner case that breaks
Internet-of-Things applications. We can tightly control clock skew in our
cluster. We most definitely CANNOT control clock skew on the thousands of
sensors that write to our cluster.

Thanks,
Cody

On Wed, Nov 30, 2016 at 5:33 PM, Robert Wille  wrote:

> In my opinion, this is not broken and “fixing” it would break existing
> code. Consider a batch that includes multiple inserts, each of which
> inserts the value returned by now(). Getting the same UUID for each insert
> would be a major problem.
>
> Cheers
>
> Robert
>
>
> On Nov 30, 2016, at 4:46 PM, Todd Fast  wrote:
>
> FWIW I'd suggest opening a bug--this behavior is certainly quite
> unexpected and more than just a documentation issue. In general I can't
> imagine any desirable properties of the current implementation, and there
> are likely a bunch of latent bugs sitting out there, so it should be fixed.
>
> Todd
>
> On Wed, Nov 30, 2016 at 12:37 PM Terry Liu  wrote:
>
>> Sorry for my typo. Obviously, I meant:
>> "It appears that a single query that calls Cassandra's`now()` time
>> function *multiple times *may actually cause a query to write or return
>> different times."
>>
>> Less of a surprise now that I realize more about the implementation, but
>> I agree that more explicit documentation around when exactly the
>> "execution" of each now() statement happens and what implications it has
>> for the resulting timestamps would be helpful when running into this.
>>
>> Thanks for the quick responses!
>>
>> -Terry
>>
>>
>>
>> On Tue, Nov 29, 2016 at 2:45 PM, Marko Švaljek 
>> wrote:
>>
>> Every now() call in a statement is under the hood "replaced" with a newly
>> generated uuid.
>>
>> It can happen that they belong to different milliseconds in time.
>>
>> If you need to have the same timestamps, you need to set them on the client
>> side.
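In other words (hypothetical table and column names), instead of letting the server evaluate now() once per column, the client generates a single timeuuid and binds the same value everywhere it is needed:

-- instead of two independent server-side evaluations of now():
--   INSERT INTO ks.readings (id, created_a, created_b) VALUES (uuid(), now(), now());
-- prepare the statement with bind markers and bind the SAME client-generated
-- timeuuid to both timeuuid columns:
INSERT INTO ks.readings (id, created_a, created_b) VALUES (?, ?, ?);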
>>
>>
>> @msvaljek 
>>
>> 2016-11-29 22:49 GMT+01:00 Terry Liu :
>>
>> It appears that a single query that calls Cassandra's `now()` time
>> function may actually cause a query to write or return different times.
>>
>> Is this the expected or defined behavior, and if so, why does it behave
>> like this rather than evaluating `now()` once across an entire statement?
>>
>> This really affects UPDATE statements but to test it more easily, you
>> could try something like:
>>
>> SELECT toTimestamp(now()) as a, toTimestamp(now()) as b
>> FROM keyspace.table
>> LIMIT 100;
>>
>> If you run that a few times, you should eventually see that the timestamp
>> returned moves onto the next millisecond mid-query.
>>
>> --
>> *Software Engineer*
>> Turnitin - http://www.turnitin.com
>> t...@turnitin.com
>>
>>
>
>


Re: Java Collections.emptyList inserted as null object in cassandra

2016-11-29 Thread Cody Yancey
It is not possible. Internally Cassandra flattens your data out into a set
of key-value-pairs. All collection types, including lists, are nothing more
than a thin layer of schema over a clustered set of key-value-pairs, so on
disk there is no difference between an empty list and a list that doesn't
exist. If you need to differentiate between the two, you will need a
separate "isSet" column that is a 1 or a 0 to indicate the presence or
absence of a list, and write your client-side code to branch on that new
column instead.
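
A sketch of that workaround, with hypothetical table and column names — since an empty list and an absent list look identical on disk, the extra flag column is what the client branches on:

CREATE TABLE ks.items (
    id       text PRIMARY KEY,
    tags     list<text>,
    tags_set int              -- 1 = a list was written (possibly empty), 0 = no list
);

-- an "empty list" write: tags ends up null on disk, but tags_set records that it exists
INSERT INTO ks.items (id, tags, tags_set) VALUES ('item-1', [], 1);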

Thanks,
Cody

On Tue, Nov 29, 2016 at 6:47 AM, Selvam Raman  wrote:

> Field type in Cassandra: List
>
> I am trying to insert Collections.emptyList() from Spark into a Cassandra
> list field. In Cassandra it is stored as a null object.
>
> How can I avoid null values here?
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>


Re: Cannot mix counter and non counter columns in the same table

2016-11-01 Thread Cody Yancey
I agree it makes code messier, but you aren't really losing anything by
separating them out into separate tables and then doing parallel queries.
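Concretely, such a split might look like the following sketch (hypothetical names): counters in one table, the regular columns in a second table with the same primary key, queried in parallel and merged client-side.

CREATE TABLE ks.page_view_counts (
    url_name   varchar,
    page_name  varchar,
    view_count counter,
    PRIMARY KEY (url_name, page_name)
);

CREATE TABLE ks.page_view_meta (
    url_name     varchar,
    page_name    varchar,
    last_title   text,
    last_crawled timestamp,
    PRIMARY KEY (url_name, page_name)
);

-- read both in parallel and merge on (url_name, page_name) in the client
SELECT * FROM ks.page_view_counts WHERE url_name = ? AND page_name = ?;
SELECT * FROM ks.page_view_meta   WHERE url_name = ? AND page_name = ?;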

Counter tables already don't support atomic batch operations (all batch
operations are unlogged) or CAS operations (LWTs are not supported), and have a
whole host of other gotchas that have less to do with Cassandra and more to do
with the mathematical underpinnings of non-idempotent operations in a world
where the Two Generals problem is unsolved.

If Cassandra WERE to allow mixing of storage paradigms into one "logical"
table it would probably be just two separate tables under the hood anyway
since the write path is so different.

This isn't Stockholm Syndrome for Cassandra as much as it is Stockholm
Syndrome for databases. I've never used a database that could count very
well, even non-distributed databases like postgres or mysql. Cassandra's
implementation is at least fast and scalable.

Thanks,
Cody

On Tue, Nov 1, 2016 at 2:13 PM, Edward Capriolo 
wrote:

> Here is a solution that I have leveraged. Ignore the count of the value and
> use a multi-part column name as its value.
>
> For example:
>
> CREATE TABLE stuff (
>     rowkey text,
>     column text,
>     value text,
>     counter_to_ignore bigint,
>     PRIMARY KEY (rowkey, column, value)
> );
>
>
>
> On Tue, Nov 1, 2016 at 9:29 AM, Ali Akhtar  wrote:
>
>> That's a terrible gotcha rule.
>>
>> On Tue, Nov 1, 2016 at 6:27 PM, Cody Yancey  wrote:
>>
>>> In your table schema, you have KEYS and you have VALUES. Your KEYS are
>>> text, but they could be any non-counter type or compound thereof. KEYS
>>> obviously cannot ever be counters.
>>>
>>> Your VALUES, however, must be either all counters or all non-counters.
>>> The official example you posted conforms to this limitation.
>>>
>>> Thanks,
>>> Cody
>>>
>>> On Nov 1, 2016 7:16 AM, "Ali Akhtar"  wrote:
>>>
>>>> I'm not referring to the primary key, just to other columns.
>>>>
>>>> My primary key is a text, and my table contains a mix of texts, ints,
>>>> and timestamps.
>>>>
>>>> If I try to change one of the ints to a counter and run the create
>>>> table query, I get the error ' Cannot mix counter and non counter
>>>> columns in the same table'
>>>>
>>>>
>>>> On Tue, Nov 1, 2016 at 6:11 PM, Cody Yancey  wrote:
>>>>
>>>>> For counter tables, non-counter types are of course allowed in the
>>>>> primary key. Counters would be meaningless otherwise.
>>>>>
>>>>> Thanks,
>>>>> Cody
>>>>>
>>>>> On Nov 1, 2016 7:00 AM, "Ali Akhtar"  wrote:
>>>>>
>>>>>> In the documentation for counters:
>>>>>>
>>>>>> https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html
>>>>>>
>>>>>> The example table is created via:
>>>>>>
>>>>>> CREATE TABLE counterks.page_view_counts
>>>>>>   (counter_value counter,
>>>>>>   url_name varchar,
>>>>>>   page_name varchar,
>>>>>>   PRIMARY KEY (url_name, page_name)
>>>>>> );
>>>>>>
>>>>>> Yet if I try to create a table with a mixture of texts, ints,
>>>>>> timestamps, and counters, i get the error ' Cannot mix counter and non
>>>>>> counter columns in the same table'
>>>>>>
>>>>>> Is that supposed to be allowed or not allowed, given that the
>>>>>> official example contains a mix of counters and non-counters?
>>>>>>
>>>>>
>>>>
>>
>


Re: Cannot mix counter and non counter columns in the same table

2016-11-01 Thread Cody Yancey
In your table schema, you have KEYS and you have VALUES. Your KEYS are
text, but they could be any non-counter type or compound thereof. KEYS
obviously cannot ever be counters.

Your VALUES, however, must be either all counters or all non-counters. The
official example you posted conforms to this limitation.

Thanks,
Cody

On Nov 1, 2016 7:16 AM, "Ali Akhtar"  wrote:

> I'm not referring to the primary key, just to other columns.
>
> My primary key is a text, and my table contains a mix of texts, ints, and
> timestamps.
>
> If I try to change one of the ints to a counter and run the create table
> query, I get the error ' Cannot mix counter and non counter columns in
> the same table'
>
>
> On Tue, Nov 1, 2016 at 6:11 PM, Cody Yancey  wrote:
>
>> For counter tables, non-counter types are of course allowed in the
>> primary key. Counters would be meaningless otherwise.
>>
>> Thanks,
>> Cody
>>
>> On Nov 1, 2016 7:00 AM, "Ali Akhtar"  wrote:
>>
>>> In the documentation for counters:
>>>
>>> https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html
>>>
>>> The example table is created via:
>>>
>>> CREATE TABLE counterks.page_view_counts
>>>   (counter_value counter,
>>>   url_name varchar,
>>>   page_name varchar,
>>>   PRIMARY KEY (url_name, page_name)
>>> );
>>>
>>> Yet if I try to create a table with a mixture of texts, ints,
>>> timestamps, and counters, i get the error ' Cannot mix counter and non
>>> counter columns in the same table'
>>>
>>> Is that supposed to be allowed or not allowed, given that the official
>>> example contains a mix of counters and non-counters?
>>>
>>
>


Re: Cannot mix counter and non counter columns in the same table

2016-11-01 Thread Cody Yancey
For counter tables, non-counter types are of course allowed in the primary
key. Counters would be meaningless otherwise.

Thanks,
Cody

On Nov 1, 2016 7:00 AM, "Ali Akhtar"  wrote:

> In the documentation for counters:
>
> https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html
>
> The example table is created via:
>
> CREATE TABLE counterks.page_view_counts
>   (counter_value counter,
>   url_name varchar,
>   page_name varchar,
>   PRIMARY KEY (url_name, page_name)
> );
>
> Yet if I try to create a table with a mixture of texts, ints, timestamps,
> and counters, i get the error ' Cannot mix counter and non counter columns
> in the same table'
>
> Is that supposed to be allowed or not allowed, given that the official
> example contains a mix of counters and non-counters?
>


Re: Problems with schema creation

2016-09-19 Thread Cody Yancey
Hi Josh,
I too have had this issue on several clusters I manage, particularly when
making schema changes. The worst part is, those nodes don't restart, and
the tables can't be dropped. Basically you have to rebuild your whole
cluster, which often requires downtime. Others have seen this on 3.0.x and it
has been documented here:

https://issues.apache.org/jira/browse/CASSANDRA-12131

If you have a good repro case, I'm sure that would be a great help toward
getting this bug some much-needed attention.

Thanks,
Cody

On Mon, Sep 19, 2016 at 1:22 PM, Josh Smith 
wrote:

> I have an automated tool we created which creates a keyspace and its
> tables, and adds indexes in Solr.  But when I run the tool, even for a new
> keyspace I end up getting ghost tables with the name “”.  If I look in
> system_schema.tables I see a bunch of tables all named
> (\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00). Am I
> creating the tables and schema too fast, or is something else wrong? Has
> anyone else run into this problem before? I have searched the mailing list
> and Google but I have not found anything.  I am running DSE 5.0 (C* 3.0.2)
> on 5 m4.4xl nodes currently.  Any help would be appreciated.
>
>
>
> Josh Smith
>