Missing table suffix in data directory directories

2016-10-09 Thread Jason Kania
Hello,
In the data directory of my 3.0.9 installation, I have directories both with 
and without suffixes:
periodicReading
periodicReadingTemp-76eb7510096811e68a7421c8b9466352
The directories with and without suffixes are being updated and for those with 
a suffix, the suffix matches the output of this command:
SELECT keyspace_name, table_name, id FROM system_schema.tables ;
Can someone indicate why some would have suffixes and others not?
Thanks,
Jason


Re: Understanding cassandra data directory contents

2016-10-08 Thread Jason Kania
Hi Vladimir,
Thanks for the response. I assume then that it is safe to remove the 
directories that are not current as per the system_schema.tables table. I have 
dozens of directories for the same table and haven't dropped and re-added it 
nearly that many times. Do any of the nodetool or other commands clean up these 
unused directories?
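
The comparison described above can be sketched in a few lines. This is a minimal illustration, not a supported cleanup tool: it assumes the directory suffix is the table id from system_schema.tables with the dashes removed, and the helper names (split_table_dir, classify_dirs) and the choice of which id is "current" are hypothetical. It only reports candidates and deletes nothing.

```python
import uuid

HEX_DIGITS = set("0123456789abcdef")

def split_table_dir(dirname):
    """Split 'name-<32 hex chars>' into (name, suffix); suffix is None if absent."""
    name, sep, suffix = dirname.rpartition("-")
    if sep and len(suffix) == 32 and set(suffix) <= HEX_DIGITS:
        return name, suffix
    return dirname, None

def classify_dirs(dirnames, current_ids):
    """Group directory names by whether their suffix matches the current table id.

    current_ids maps table_name -> id string as returned by
    SELECT keyspace_name, table_name, id FROM system_schema.tables;
    """
    current = {name: uuid.UUID(tid).hex for name, tid in current_ids.items()}
    result = {"current": [], "stale": [], "no_suffix": []}
    for d in dirnames:
        name, suffix = split_table_dir(d)
        if suffix is None:
            result["no_suffix"].append(d)
        elif current.get(name) == suffix:
            result["current"].append(d)
        else:
            result["stale"].append(d)
    return result

# Directory names taken from this thread; which id is current is assumed here.
dirs = [
    "periodicReading-76eb7510096811e68a7421c8b9466352",
    "periodicReading-453d55a0501d11e68623a9d2b6f96e86",
    "periodicReading",
]
report = classify_dirs(dirs, {"periodicReading": "453d55a0-501d-11e6-8623-a9d2b6f96e86"})
print(report)
```

Directories reported as "stale" are only candidates; it seems prudent to check them for snapshots or recent writes before removing anything.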

Thanks,
Jason Kania

  From: Vladimir Yudovin <vla...@winguzone.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Saturday, October 8, 2016 2:05 PM
 Subject: Re: Understanding cassandra data directory contents
   
Each table has a unique ID (the suffix). If you drop a table and then recreate 
it with the same name, it gets a new ID.

Run
SELECT keyspace_name, table_name, id FROM system_schema.tables ;
to determine the current ID.

You can limit the query to a specific keyspace or table.


Best regards, Vladimir Yudovin, 
Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.




Understanding cassandra data directory contents

2016-10-08 Thread Jason Kania
Hello,
I am using Cassandra 3.0.9 and I have encountered an issue where the nodes in 
my 3 node cluster have vastly different amounts of data even though they should 
be roughly the same. When I looked through the data directory for my database 
on two of the nodes, I see a number of directories with the same prefix, e.g.:
periodicReading-76eb7510096811e68a7421c8b9466352,periodicReading-453d55a0501d11e68623a9d2b6f96e86...

Only one directory with a specific table name prefix has a current date and the 
rest are older.
In contrast, on the node with the least space used, each directory has a unique 
prefix (not shared).
I am wondering what the contents of a Cassandra database directory should look 
like. Are there supposed to be multiple entries for a given table or just one?
If just one, what would be a procedure to determine whether the other 
directories for the same table are junk that can be removed?

Thanks,
Jason


Re: Nodetool repair inconsistencies

2016-06-08 Thread Jason Kania
Hi Paul,
I have tried running 'nodetool compact' and the situation remains the same 
after I deleted the files that caused 'nodetool compact' to generate an 
exception in the first place.
My concern is that if I delete some sstable sets from a directory or even if I 
completely eliminate the sstables in a directory on one machine, run 'nodetool 
repair' followed by 'nodetool compact', that directory remains empty. My 
understanding has been that these equivalently named directories should contain 
roughly the same amount of content.
Thanks,
Jason

  From: Paul Fife <paulf...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Wednesday, June 8, 2016 12:55 PM
 Subject: Re: Nodetool repair inconsistencies
   
Hi Jason -
Did you run a major compaction after the repair completed? Do you have other 
reasons besides the number/size of sstables to believe all nodes don't have a 
copy of the current data at the end of the repair operation?
Thanks,
Paul

Re: Nodetool repair inconsistencies

2016-06-08 Thread Jason Kania
Hi Romain,
The problem is that there is no error to share. I am focusing on the 
inconsistency that when I run nodetool repair, get no errors and yet the 
content in the same directory on the different nodes is vastly different. This 
lack of an error is the nature of my question, not the nodetool compact error.
Thanks,
Jason
  From: Romain Hardouin <romainh...@yahoo.fr>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Jason Kania 
<jason.ka...@ymail.com> 
 Sent: Wednesday, June 8, 2016 8:30 AM
 Subject: Re: Nodetool repair inconsistencies
   
Hi Jason,
It's difficult for the community to help you if you don't share the error ;-) 
What did the logs say when you ran a major compaction (i.e. the first error 
you encountered)?
Best,
Romain


Nodetool repair inconsistencies

2016-06-07 Thread Jason Kania
I am running a 3 node cluster of 3.0.6 instances and encountered an error when 
running nodetool compact. I then ran nodetool repair. No errors were returned.
I then attempted to run nodetool compact again, but received the same error so 
the repair made no correction and reported no errors.
After that, I moved the problematic files out of the directory, restarted 
cassandra and attempted the repair again. The repair again completed without 
errors, however, no files were added to the directory that had contained the 
corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around and numerous directories have vastly different amounts 
of content across the 3 nodes. There are 3 replicas so I would expect to find 
similar amounts of content in the same data directory on the different nodes.

Is there any way to dig deeper into this? I don't want to be caught because 
replication/repair is silently failing. I noticed that there is always a "some 
repair failed" message amongst the repair output, but that is completely 
unhelpful and has always been present.

Thanks,
Jason


Re: Inconsistent query results and node state

2016-03-31 Thread Jason Kania
Thanks for responding. The problems that we are having are in Cassandra 3.0.3 
and 3.0.4. We had upgraded to see if the problem went away.

The values have been out of sync this way for some time and we cannot get a row 
with the 1969 timestamp in any query that directly queries on the timestamp. 
The 1969-12-31 19:00 value comes inconsistently in range queries but seems to 
be tied to the 192.168.10.9 node.
We tried applying the writetime function to the time column in the query, but 
that is not allowed because the time column is part of the primary key. Instead 
we used it on an additional field that is written at the same time (classId):

 subscriberId  sensorUnitId  sensorId  time              writetime(classId)
 JASKAN        0             0         2015-05-24 02:09  1458178461272000
 JASKAN        0             0         1969-12-31 19:00  1458178801214000
 JASKAN        0             0         2016-01-21 02:10  1458178801221000
 JASKAN        0             0         2016-01-21 02:10  1458178801226000
 JASKAN        0             0         2016-01-21 02:10  1458178801231000
 JASKAN        0             0         2016-01-21 02:11  1458178801235000
 JASKAN        0             0         2016-01-21 02:22  1458178801241000
 JASKAN        0             0         2016-01-21 02:22  1458178801247000
 JASKAN        0             0         2016-01-21 02:22  1458178801252000
 JASKAN        0             0         2016-01-21 02:22  1458178801258000

Based on the other column values in the table row, we confirmed that the row 
showing up with the 1969-12-31 19:00 timestamp actually corresponds to the 
following row:

 subscriberId  sensorUnitId  sensorId  time              writetime(classId)
 JASKAN        0             0         2016-01-21 02:09  1458178801214000

When queried directly, the 2016-01-21 02:09 timestamp is always present on all 
nodes, according to tracing.
To me it just seems like the timestamp column value is sometimes not being set 
somewhere in the pipeline and the result is the epoch 0 value.
Thoughts on how to proceed?
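
For reference, the arithmetic behind the "epoch 0" reading can be checked with a short sketch. It assumes the client displays timestamps at UTC-5 (suggested by the -0500 offset in the original query literal) and that writetime() values are microseconds since the Unix epoch; the function names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption: display zone is UTC-5, matching the '-0500' literal in the thread.
LOCAL = timezone(timedelta(hours=-5))

def render_millis(ms):
    """Format a CQL timestamp (milliseconds since the epoch) in the LOCAL zone."""
    return (EPOCH + timedelta(milliseconds=ms)).astimezone(LOCAL).strftime("%Y-%m-%d %H:%M")

def render_writetime(us):
    """Format a writetime() value (microseconds since the epoch) in UTC."""
    return (EPOCH + timedelta(microseconds=us)).strftime("%Y-%m-%d %H:%M:%S")

print(render_millis(0))                      # 1969-12-31 19:00
print(render_writetime(1458178801214000))    # 2016-03-17 01:40:01
```

A zero timestamp rendered at UTC-5 gives exactly the 1969-12-31 19:00 value seen in the results, while the write time of that row is an ordinary March 2016 value.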
Thanks,

Jason

  From: Tyler Hobbs <ty...@datastax.com>
 To: user@cassandra.apache.org 
 Sent: Wednesday, March 30, 2016 11:31 AM
 Subject: Re: Inconsistent query results and node state
   

org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
DecoratedKey(-4908797801227889951, 4a41534b414e) 
(6a6c8ab013d7757e702af50cbdae045c vs 2ece61a01b2a640ac10509f4c49ae6fb)

That key matches the row you mentioned, so it seems like all of the replicas 
should have converged on the same value for that row.  Do you consistently get 
the 1969-12-31 19:00 timestamp back now?  If not, try selecting both "time" and 
"writetime(time)" from that row and see what write timestamp each of the 
values has.
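
The match between the key in the exception and the row can be checked by decoding the hex field of the DecoratedKey. The sketch below assumes a single text partition key column stored as UTF-8; composite partition keys carry extra framing bytes and are not handled. parse_decorated_key is a hypothetical helper, not part of any Cassandra tooling.

```python
import re

def parse_decorated_key(text):
    """Extract (token, raw key bytes) from a 'DecoratedKey(<token>, <hex>)' string."""
    m = re.search(r"DecoratedKey\((-?\d+),\s*([0-9a-fA-F]+)\)", text)
    if m is None:
        raise ValueError("no DecoratedKey found")
    return int(m.group(1)), bytes.fromhex(m.group(2))

line = "Mismatch for key DecoratedKey(-4908797801227889951, 4a41534b414e)"
token, key = parse_decorated_key(line)
print(token, key.decode("utf-8"))  # -4908797801227889951 JASKAN
```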

The ArrayIndexOutOfBoundsException in response to nodetool compact looks like a 
bug.  What version of Cassandra are you running?
 

Re: Inconsistent query results and node state

2016-03-31 Thread Jason Kania
Thanks for the response.

All nodes are using NTP.
Thanks,
Jason

  From: Kai Wang <dep...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Wednesday, March 30, 2016 10:59 AM
 Subject: Re: Inconsistent query results and node state
   
Do you have NTP setup on all nodes?


Inconsistent query results and node state

2016-03-29 Thread Jason Kania
We have encountered a query inconsistency problem in which the following query 
sporadically returns different results. The invalid results contain a timestamp 
value that looks as if the field were uninitialized (a zero timestamp).

Attempts to repair and compact have not changed the results.

select "subscriberId","sensorUnitId","sensorId","time" from 
"sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND 
"sensorId"=0 ORDER BY "time" LIMIT 10;

Invalid Query Results
subscriberId    sensorUnitId    sensorId    time
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    1969-12-31 19:00
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:11
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22

Valid Query Results
subscriberId    sensorUnitId    sensorId    time
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:11
JASKAN    0    0    2015-05-24 2:13
JASKAN    0    0    2015-05-24 2:13
JASKAN    0    0    2015-05-24 2:14

We have confirmed that the 1969-12-31 timestamp is not within the data by 
running a number of queries, so it looks like the invalid timestamp value is 
generated by the query. The query below returns no row.

select * from "sensorReadingIndex" where "subscriberId"='JASKAN' AND 
"sensorUnitId"=0 AND "sensorId"=0 AND time='1969-12-31 19:00:00-0500';

No logs are coming out but the following was observed intermittently in the 
tracing output, but not correlated to the invalid query results:

 Digest mismatch: org.apache.cassandra.service.DigestMismatchException: 
Mismatch for key DecoratedKey(-7563144029910940626, 
00064a41534b414e040400) 
(be22d379c18f75c2f51dd6942d2f9356 vs da4e95d571b41303b908e0c5c3fff7ba) 
[ReadRepairStage:3179] | 2016-03-29 23:12:35.025000 | 192.168.10.10 |
An error from the debug log that might be related is:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4908797801227889951, 4a41534b414e) (6a6c8ab013d7757e702af50cbdae045c vs 2ece61a01b2a640ac10509f4c49ae6fb)
    at org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225) ~[apache-cassandra-3.0.3.jar:3.0.3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_74]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_74]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]

The tracing files are attached and seem to show that, in the failed case, 
content is skipped because of tombstones, if we understand it correctly. This 
could be an inconsistency problem on 192.168.10.9. Unfortunately, attempts to 
compact on 192.168.10.9 only give the following error without any stack trace 
detail, and repair does not fix it.

root@cutthroat:/usr/local/bin/analyzer/bin# nodetool compact
error: null
-- StackTrace --
java.lang.ArrayIndexOutOfBoundsException
Any suggestions on how to fix or what to search for would be much appreciated.
Thanks,
Jason



Tracing session: 09e26410-f626-11e5-8b85-9b8e819c8182

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2016-03-29 23:18:13.969000 | 192.168.10.10 | 0
 Parsing select "subscriberId","sensorUnitId","sensorId","time" from "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND "sensorId"=0 ORDER BY "time" LIMIT 10; [SharedPool-Worker-2] | 2016-03-29 23:18:13.97 | 192.168.10.10 | 181
 READ message received from /192.168.10.10 [MessagingService-Incoming-/192.168.10.10] | 2016-03-29 23:18:13.97 | 192.168.10.9 | 20

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
Our analytics currently pulls in all the data for a single sensor reading as we 
use it in its entirety during signal processing. We may add secondary indices 
to the table in the future to pull in broadly classified data, but right now, 
our only goal is this bulk retrieval.
  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: user@cassandra.apache.org 
 Sent: Friday, March 11, 2016 7:25 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
  
Thanks, that level of query detail gives us a better picture to focus on. I'll 
think through this some more over the weekend.
Also, these queries focus on raw, bulk retrieval of sensor data readings, but 
do you have reading-based queries, such as range of an actual sensor reading?
-- Jack Krupansky
On Fri, Mar 11, 2016 at 7:08 PM, Jason Kania <jason.ka...@ymail.com> wrote:

The 5000 readings mentioned would be against a single sensor on a single sensor 
unit.

The scope of the queries on this table is intended to be fairly simple. Here 
are some example queries, without 'sharding', that we would perform on this 
table:

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time<=? ORDER BY time DESC LIMIT 5000

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time>=? ORDER BY time LIMIT 5000

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time<=? AND classification=? 
ORDER BY time DESC LIMIT 5000

where 'classification' is a secondary index that we expect to add.

In some cases, we have to revisit all values too, so a complete table scan is 
needed:

SELECT "time","readings" FROM "sensorReadings"

Getting the "next" and "previous" 5000 readings is also something we do, but is 
manageable from our standpoint as we can look at the range-end timestamps that 
are returned and use those in the subsequent queries.
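
The paging approach just described, where the last timestamp of one page seeds the next query, can be sketched in memory as follows. This is only an illustration under assumptions: fetch_page stands in for a CQL query with a strict bound (time > ?) so that the boundary row is not returned twice, and timestamps are assumed unique within a partition. Both function names are hypothetical.

```python
from bisect import bisect_right

def fetch_page(times, after, limit):
    """Stand-in for: ... AND time > ? ORDER BY time LIMIT ?  (times must be sorted)."""
    start = bisect_right(times, after)
    return times[start:start + limit]

def iter_pages(times, start_after, limit):
    """Yield pages, seeding each query with the last timestamp of the previous page."""
    cursor = start_after
    while True:
        page = fetch_page(times, cursor, limit)
        if not page:
            return
        yield page
        cursor = page[-1]

# 12000 synthetic, sorted timestamps paged 5000 at a time.
readings = list(range(12000))
pages = list(iter_pages(readings, -1, 5000))
print([len(p) for p in pages])
```

With an inclusive bound (time >= ?), as in the example queries, the caller would need to drop the first row of each subsequent page instead.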

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time>=? AND time<=? ORDER BY time LIMIT 5000

Splitting the bulk content out of the main table is something we considered too 
but we didn't find any detail on whether that would solve our timeout problem. 
If there is a reference for using this approach, it would be of interest to us 
to avoid any assumptions on how we would approach it.

A question: Is the probability of a timeout directly linked to a longer seek 
time in reading through a partition's contents? If that is the case, splitting 
the partition keys into a separate table would be straightforward.

Regards,
Jason


Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
Hi Carlos,
Thanks for the suggestions.
We are having partition size issues and that was why we started to do custom 
sharding/partition division based on time. As you mentioned, we are having 
problems with identification. It's the identification of the shard range that 
we need to understand, and our data doesn't necessarily run until the current 
time. 
My worry with storing that last shard id in another table is that we would 
update the same row in that table all the time creating tombstones.
It is good to know that returning empty partitions is not that costly as that 
is a concern when we don't know where to start and end.
Thanks,
Jason


  From: Carlos Alonso <i...@mrcalonso.com>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
 Sent: Friday, March 11, 2016 7:24 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
  
Hi Jason,
If I understand correctly you have no problems with the size of your partitions 
or transactional queries but with the 'identification' of them when having to 
do analytical queries.
I'd then suggest two options:
1. Keep using Cassandra and store the first 'bucket' of each sensor in a 
separate table to use as the starting point of your full scan queries. Then 
issue async queries incrementing the bucket until today (logical end of the 
data). Cassandra is very efficient at returning empty partitions, so querying 
on empty buckets is normally fine.
2. Periodically offload your 'historic' data to another storage more 
appropriate for analytics (Parquet + S3) and query it using Spark.
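
Option 1 amounts to enumerating every bucket between a stored first bucket and today, then querying each one. A minimal sketch of the enumeration follows, assuming weekly buckets identified by the Monday that starts the week; the actual timeShard encoding is not given in the thread, so week_bucket and buckets_between are hypothetical.

```python
from datetime import date, timedelta

def week_bucket(d):
    """Hypothetical timeShard: the Monday that starts the week containing d."""
    return d - timedelta(days=d.weekday())

def buckets_between(first_day, last_day):
    """Yield every weekly bucket from the one holding first_day to the one holding last_day."""
    current = week_bucket(first_day)
    end = week_bucket(last_day)
    while current <= end:
        yield current
        current += timedelta(weeks=1)

# Each yielded bucket would become the timeShard value of one (async) query.
shards = list(buckets_between(date(2016, 3, 1), date(2016, 3, 12)))
print(shards)
```

Buckets with no data simply return empty results, which the advice above suggests is cheap.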
Hope it helps
On Saturday, 12 March 2016, Jack Krupansky <jack.krupan...@gmail.com> wrote:

Thanks for the additional information, but there is still not enough color on 
the queries and too much focus on a premature data model.
Is this 5000 readings for a single sensor of a single sensor unit, or for all 
sensors of a specified unit, or... both?
I presume you want "next" and "previous" 5000 readings as well as first and 
last, but... you will have to confirm that.
One technique is to store the bulk of your raw sensor data in a separate table 
and then simply store the PK of that data in your time series. That way you can 
have a much wider row of time series (number of rows) without hitting a bulk 
size issue for the partition. But... I don't want to jump to solutions until we 
have a firmer handle on the query side of the fence.
-- Jack Krupansky
On Fri, Mar 11, 2016 at 5:37 PM, Jason Kania <jason.ka...@ymail.com> wrote:

Jack,
Thanks for the response.
We are targeting our database design to 1 sensor units and each sensor unit 
has 32 sensors. We are seeing about 700 events per day per sensor, each 
providing about 2K of data. Based on keeping each partition to about 10 Mb 
(based on readings we saw on performance), we chose to break our partitions on 
a weekly basis. This is possibly finer than we need as we were seeing timeouts 
only once a single partition was about 150Mb in size.
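
The sizing figures above can be reproduced with a quick back-of-the-envelope calculation. This sketch reads "about 2K" as 2 KiB and ignores storage overhead and compression, both of which are assumptions.

```python
EVENTS_PER_DAY = 700        # per sensor, from the message above
BYTES_PER_EVENT = 2 * 1024  # "about 2K", read here as 2 KiB (assumption)
MIB = 1024 * 1024

def partition_mib(days):
    """Approximate raw partition size in MiB for one sensor after the given days."""
    return EVENTS_PER_DAY * BYTES_PER_EVENT * days / MIB

def days_to_reach(limit_mib):
    """Approximate number of days until one sensor's partition reaches limit_mib."""
    return limit_mib * MIB / (EVENTS_PER_DAY * BYTES_PER_EVENT)

print(round(partition_mib(7), 2))   # size of a weekly partition in MiB
print(round(days_to_reach(150)))    # days until the size where timeouts appeared
```

Under these assumptions a weekly partition comes to a little under 10 MiB, and a single unsplit partition would take roughly 110 days to reach the 150 MB size at which timeouts were observed.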

When pulling in data, we will typically need to pull 1 to 4 months of data for 
our analysis and will use only the sensorUnitId and sensorId to uniquely 
identify the data source with the timeShard value used to break up our 
partitions. We have handling to sequentially scan based on our "timeShard" 
value, but don't have a good handle on the determination of the "timeShard" 
portion of the partition key at read time. The data starts coming in when a 
subscriber starts using our system and finishes when they discontinue service 
or put the service on hold temporarily.

When I talk about hotspots, it isn't the time series data that is the concern, 
it is with respect to storing the maximum and minimum timeShard values in 
another table for subsequent lookup or the cost of running the current 
implementation of SELECT DISTINCT. We need to run queries such as getting the 
first or last 5000 sensor readings when we don't know the time frame at which 
they occurred so cannot directly supply the timeShard portion of our partition 
key.

I appreciate your input,
Thanks,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
 Sent: Friday, March 11, 2016 4:45 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
I'll stay away from advising on a specific schema per se, but I'll stick to the 
advice that you need to make sure that your queries are depending solely on the 
columns of the primary key or relatively short slices/scans, rather than run 
the risk of very long scans or having to process multiple partitions for a 
single query. That's canned to some extent, but still essential.
Of course we generally wish to avoid hotspots, but with time series they are 
unavoidable. I mean, sure you could place successive events at separate 
partitions, but then you can't do any kin

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
The 5000 readings mentioned would be against a single sensor on a single sensor 
unit.

The scope of the queries on this table is intended to be fairly simple. Here 
are some example queries, without 'sharding', that we would perform on this 
table:

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time<=? ORDER BY time DESC LIMIT 5000
SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time>=? ORDER BY time LIMIT 5000
SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time<=? AND classification=? 
ORDER BY time DESC LIMIT 5000
where 'classification' is a secondary index that we expect to add.

In some cases, we have to revisit all values too so a complete table scan is 
needed:
SELECT "time","readings" FROM "sensorReadings"
Getting the "next" and "previous" 5000 readings is also something we do, but is 
manageable from our standpoint as we can look at the range-end timestamps that 
are returned and use those in the subsequent queries.
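
The "next 5000" handling described above can be sketched as follows. This is an in-memory stand-in, not driver code: fetch_page plays the role of the bounded range query, timestamps are assumed to be epoch milliseconds, and it relies on "time" being unique within a partition (which holds when it is the only clustering column). All names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Iterator, List, Tuple

Reading = Tuple[int, bytes]  # (time in epoch milliseconds, readings blob)


def fetch_page(rows: List[Reading], start: int, end: int, limit: int) -> List[Reading]:
    """Stand-in for: ... WHERE time >= ? AND time <= ? ORDER BY time LIMIT ?"""
    times = [t for t, _ in rows]  # rows must already be sorted by time
    lo, hi = bisect_left(times, start), bisect_right(times, end)
    return rows[lo:hi][:limit]


def iterate_pages(rows: List[Reading], start: int, end: int, limit: int) -> Iterator[List[Reading]]:
    """Yield successive pages, resuming just past the last timestamp returned."""
    cursor = start
    while True:
        page = fetch_page(rows, cursor, end, limit)
        if not page:
            return
        yield page
        cursor = page[-1][0] + 1  # next millisecond after the range end


rows = [(t, b"") for t in range(0, 100, 10)]
for page in iterate_pages(rows, 0, 95, 4):
    print([t for t, _ in page])
```

Paging backwards works the same way with the comparison and the sort order reversed, using the earliest timestamp of the page as the new upper bound.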

SELECT "time","readings" FROM "sensorReadings" WHERE "sensorUnitId"=5123 AND 
"sensorId"=17 AND time>=? AND time<=? ORDER BY time LIMIT 5000
Splitting the bulk content out of the main table is something we considered too 
but we didn't find any detail on whether that would solve our timeout problem. 
If there is a reference for using this approach, it would be of interest to us 
to avoid any assumptions on how we would approach it.

A question: Is the probability of a timeout directly linked to a longer seek 
time in reading through a partition's contents? If that is the case, splitting 
the partition keys into a separate table would be straightforward.

Regards,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Friday, March 11, 2016 6:22 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
Thanks for the additional information, but there is still not enough color on 
the queries and too much focus on a premature data model.
Is this 5000 readings for a single sensor of a single sensor unit, or for all 
sensors of a specified unit, or... both?
I presume you want "next" and "previous" 5000 readings as well as first and 
last, but... you will have to confirm that.
One technique is to store the bulk of your raw sensor data in a separate table 
and then simply store the PK of that data in your time series. That way you can 
have a much wider row of time series (number of rows) without hitting a bulk 
size issue for the partition. But... I don't want to jump to solutions until we 
have a firmer handle on the query side of the fence.
-- Jack Krupansky
On Fri, Mar 11, 2016 at 5:37 PM, Jason Kania <jason.ka...@ymail.com> wrote:

Jack,
Thanks for the response.
We are targeting our database design to 1 sensor units and each sensor unit 
has 32 sensors. We are seeing about 700 events per day per sensor, each 
providing about 2K of data. Based on keeping each partition to about 10 MB 
(based on readings we saw on performance), we chose to break our partitions on 
a weekly basis. This is possibly finer than we need as we were seeing timeouts 
only once a single partition was about 150 MB in size.

When pulling in data, we will typically need to pull 1 to 4 months of data for 
our analysis and will use only the sensorUnitId and sensorId to uniquely 
identify the data source with the timeShard value used to break up our 
partitions. We have handling to sequentially scan based on our "timeShard" 
value, but don't have a good handle on the determination of the "timeShard" 
portion of the partition key at read time. The data starts coming in when a 
subscriber starts using our system and finishes when they discontinue service 
or put the service on hold temporarily.

When I talk about hotspots, it isn't the time series data that is the concern, 
it is with respect to storing the maximum and minimum timeShard values in 
another table for subsequent lookup or the cost of running the current 
implementation of SELECT DISTINCT. We need to run queries such as getting the 
first or last 5000 sensor readings when we don't know the time frame at which 
they occurred so cannot directly supply the timeShard portion of our partition 
key.

I appreciate your input,
Thanks,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
 Sent: Friday, March 11, 2016 4:45 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
Jack,
Thanks for the response.
We are targeting our database design to 1 sensor units and each sensor unit 
has 32 sensors. We are seeing about 700 events per day per sensor, each 
providing about 2K of data. Based on keeping each partition to about 10 MB 
(based on readings we saw on performance), we chose to break our partitions on 
a weekly basis. This is possibly finer than we need as we were seeing timeouts 
only once a single partition was about 150 MB in size.

When pulling in data, we will typically need to pull 1 to 4 months of data for 
our analysis and will use only the sensorUnitId and sensorId to uniquely 
identify the data source with the timeShard value used to break up our 
partitions. We have handling to sequentially scan based on our "timeShard" 
value, but don't have a good handle on the determination of the "timeShard" 
portion of the partition key at read time. The data starts coming in when a 
subscriber starts using our system and finishes when they discontinue service 
or put the service on hold temporarily.

When I talk about hotspots, it isn't the time series data that is the concern, 
it is with respect to storing the maximum and minimum timeShard values in 
another table for subsequent lookup or the cost of running the current 
implementation of SELECT DISTINCT. We need to run queries such as getting the 
first or last 5000 sensor readings when we don't know the time frame at which 
they occurred so cannot directly supply the timeShard portion of our partition 
key.

I appreciate your input,
Thanks,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
 Sent: Friday, March 11, 2016 4:45 PM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
I'll stay away from advising on a specific schema per se, but I'll stick to the 
advice that you need to make sure that your queries are depending solely on the 
columns of the primary key or relatively short slices/scans, rather than run 
the risk of very long scans or having to process multiple partitions for a 
single query. That's canned to some extent, but still essential.
Of course we generally wish to avoid hotspots, but with time series they are 
unavoidable. I mean, sure you could place successive events at separate 
partitions, but then you can't do any kind of scanning/slicing.
But, events for separate sensors are not true hotspots in the traditional sense 
- unless you have only a single sensor/unit.
After considering your queries, the next step is to consider the cardinality of 
your data - how many sensors, how many units, rate of events, etc. That will 
feedback into queries as well, such as how big a slice or scan might be, as 
well as sizing of partitions.
So, how many sensor units do you expect, how many sensors per unit, and 
expected rate of events per sensor?
Try not to jump too quickly to specific solutions - there really is a method to 
understanding all of this other stuff upfront.
-- Jack Krupansky
On Thu, Mar 10, 2016 at 12:39 PM, Jason Kania <jason.ka...@ymail.com> wrote:

Jack,
Thanks for the response. I don't think I provided enough information, and I used 
the wrong terminology, as your response is more the canned advice given in 
response to Cassandra antipatterns.
To make this clearer, this is what we are doing:
create table sensorReadings (sensorUnitId int,
sensorId int,time timestamp,timeShard int,
readings blob,primary key((sensorUnitId, sensorId, timeShard), time));
where timeShard is a combination of year and week of year
For known time range based queries, this works great. However, the specific 
problem is in knowing the maximum and minimum timeShard values when we want to 
select the entire range of data. Our understanding is that if we update another 
related table with the maximum and minimum timeShard value for a given 
sensorUnitId and sensorId combination, we will create a hotspot and lots of 
tombstones. If we SELECT DISTINCT, we get a huge list of partition keys for the 
table because we cannot reduce the scope with a where clause.

If there is a recommended pattern that solves this, we haven't come across it.

I hope this makes the problem clearer.
Thanks,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Thursday, March 10, 2016 10:42 AM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
There is an effort underway to support wider rows: 
https://issues.apache.org/jira/browse/CASSANDRA-9754

This won't help you now though. Even with that improvement you still may need a 
more optimal data model since large-scale scanning/filtering is always a very 
bad idea with Cassandra.
The data modeling methodology for Cassandra dictates that queries drive the 
data model and that each form of query requires a separate table (

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
Jack,
Thanks for the response. I don't think I provided enough information, and I used 
the wrong terminology, as your response is more the canned advice given in 
response to Cassandra antipatterns.
To make this clearer, this is what we are doing:
create table sensorReadings (sensorUnitId int,
sensorId int,time timestamp,timeShard int,
readings blob,primary key((sensorUnitId, sensorId, timeShard), time));
where timeShard is a combination of year and week of year
For known time range based queries, this works great. However, the specific 
problem is in knowing the maximum and minimum timeShard values when we want to 
select the entire range of data. Our understanding is that if we update another 
related table with the maximum and minimum timeShard value for a given 
sensorUnitId and sensorId combination, we will create a hotspot and lots of 
tombstones. If we SELECT DISTINCT, we get a huge list of partition keys for the 
table because we cannot reduce the scope with a where clause.

If there is a recommended pattern that solves this, we haven't come across it.

I hope this makes the problem clearer.
Thanks,
Jason

  From: Jack Krupansky <jack.krupan...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Thursday, March 10, 2016 10:42 AM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
There is an effort underway to support wider rows: 
https://issues.apache.org/jira/browse/CASSANDRA-9754

This won't help you now though. Even with that improvement you still may need a 
more optimal data model since large-scale scanning/filtering is always a very 
bad idea with Cassandra.
The data modeling methodology for Cassandra dictates that queries drive the 
data model and that each form of query requires a separate table (a "query 
table"). Materialized views can automate that process for a lot of cases, but in 
any case it does sound as if some of your queries do require additional tables.
As a general proposition, Cassandra should not be used for heavy filtering - 
query tables with the filtering criteria baked into the PK is the way to go.

-- Jack Krupansky
On Thu, Mar 10, 2016 at 8:54 AM, Jason Kania <jason.ka...@ymail.com> wrote:

Hi,
We have sensor input that creates very wide rows and operations on these rows 
have started to time out regularly. We have been trying to find a solution to 
dividing wide rows but keep hitting limitations that move the problem around 
instead of solving it.
We have a partition key consisting of a sensorUnitId and a sensorId and use a 
time field to access each column in the row. We tried adding a time based 
entry, timeShardId, to the partition key that consists of the year and week of 
year during which the reading was taken. This works for a number of queries but 
for scanning all the readings against a particular sensorUnitId and sensorId 
combination, we seem to be stuck.
We won't know the range of valid values of the timeShardId for a given 
sensorUnitId and sensorId combination so would have to write to an additional 
table to track the valid timeShardId. We suspect this would create tombstone 
accumulation problems given the number of updates required to the same row so 
haven't tried this option.

Alternatively, we hit a different bottleneck in the form of SELECT DISTINCT in 
trying to directly access the partition keys. Since SELECT DISTINCT does not 
allow for a where clause to filter on the partition key values, we have to 
filter several hundred thousand partition keys just to find those related to 
the relevant sensorUnitId and sensorId. This problem will only grow worse for 
us.
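
For illustration, the client-side filtering described above amounts to something like the following. It is a sketch over an in-memory list of keys, assuming the partition key is the tuple (sensorUnitId, sensorId, timeShard) from the schema elsewhere in the thread; shard_bounds is a hypothetical helper, and the keys would in practice come from the SELECT DISTINCT result.

```python
from typing import Iterable, Optional, Tuple

PartitionKey = Tuple[int, int, int]  # (sensorUnitId, sensorId, timeShard)


def shard_bounds(
    keys: Iterable[PartitionKey], unit_id: int, sensor_id: int
) -> Optional[Tuple[int, int]]:
    """Scan every partition key and return (min, max) timeShard for one sensor, or None."""
    shards = [
        shard
        for unit, sensor, shard in keys
        if unit == unit_id and sensor == sensor_id
    ]
    return (min(shards), max(shards)) if shards else None


keys = [(5123, 17, 201608), (5123, 17, 201610), (5123, 18, 201601), (42, 17, 201552)]
print(shard_bounds(keys, 5123, 17))
```

The work is proportional to the total number of partitions in the table rather than to the number belonging to the requested sensor, which is why the cost keeps growing as data accumulates.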

Are there any other approaches that can be suggested? We have been looking 
around, but haven't found any references beyond the initial suggestion to add 
some sort of shard id to the partition key to handle wide rows.
Thanks,
Jason




   

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
Hi Jonathan,

Thanks for the response. To make this clearer, this is what we are doing:
create table sensorReadings (sensorUnitId int,
sensorId int,time timestamp,timeShard int,
readings blob,primary key((sensorUnitId, sensorId, timeShard), time);
where timeShard is a combination of year and week of year
This works exactly as you mentioned when we know what time range we are 
querying.

The problem is that for those cases where we want to run through all the 
readings for all timestamps, we don't know the first and last timeShard value 
to use to constrain the query or iterate over each shard. Our understanding is 
that updating another table with the maximum or minimum timeShard values on 
every write to the above table would mean pounding a single row with updates 
and running SELECT DISTINCT pulls all partition keys.

Hopefully this is clearer.
Again, any suggestions would be appreciated.

Thanks,
Jason

  From: Jonathan Haddad <j...@jonhaddad.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Thursday, March 10, 2016 11:21 AM
 Subject: Re: Strategy for dividing wide rows beyond just adding to the 
partition key
   
Have you considered making the date (or week, or whatever, some time component) 
part of your partition key?
something like:
create table sensordata (sensor_id int,day date,ts timestamp,reading int,primary 
key((sensor_id, day), ts));
Then if you know you need data by a particular date range, just issue multiple 
async queries for each day you need.
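
A minimal sketch of that fan-out, using a thread pool in place of a driver's async API. The query_day callable is a hypothetical stand-in for "run the single-partition query for (sensor_id, day)"; nothing here is taken from a real Cassandra client.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Callable, List


def days_in_range(first: date, last: date) -> List[date]:
    """Every calendar day from first to last, inclusive."""
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


def read_range(
    query_day: Callable[[int, date], List[int]],
    sensor_id: int,
    first: date,
    last: date,
) -> List[int]:
    """Issue one query per day concurrently and concatenate results in day order."""
    days = days_in_range(first, last)
    with ThreadPoolExecutor(max_workers=8) as pool:
        # map() preserves input order, so results come back oldest day first.
        per_day = pool.map(lambda d: query_day(sensor_id, d), days)
        return [reading for day_rows in per_day for reading in day_rows]


# Fake per-day storage standing in for the table.
fake = {date(2016, 3, 8): [1, 2], date(2016, 3, 9): [], date(2016, 3, 10): [3]}
print(read_range(lambda s, d: fake.get(d, []), 7, date(2016, 3, 8), date(2016, 3, 10)))
```

The same shape applies to weekly shards: only the function that enumerates partition-key values changes.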
On Thu, Mar 10, 2016 at 5:57 AM Jason Kania <jason.ka...@ymail.com> wrote:

Hi,
We have sensor input that creates very wide rows and operations on these rows 
have started to time out regularly. We have been trying to find a solution to 
dividing wide rows but keep hitting limitations that move the problem around 
instead of solving it.
We have a partition key consisting of a sensorUnitId and a sensorId and use a 
time field to access each column in the row. We tried adding a time based 
entry, timeShardId, to the partition key that consists of the year and week of 
year during which the reading was taken. This works for a number of queries but 
for scanning all the readings against a particular sensorUnitId and sensorId 
combination, we seem to be stuck.
We won't know the range of valid values of the timeShardId for a given 
sensorUnitId and sensorId combination so would have to write to an additional 
table to track the valid timeShardId. We suspect this would create tombstone 
accumulation problems given the number of updates required to the same row so 
haven't tried this option.

Alternatively, we hit a different bottleneck in the form of SELECT DISTINCT in 
trying to directly access the partition keys. Since SELECT DISTINCT does not 
allow for a where clause to filter on the partition key values, we have to 
filter several hundred thousand partition keys just to find those related to 
the relevant sensorUnitId and sensorId. This problem will only grow worse for 
us.

Are there any other approaches that can be suggested? We have been looking 
around, but haven't found any references beyond the initial suggestion to add 
some sort of shard id to the partition key to handle wide rows.
Thanks,
Jason



   

Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
Hi,
We have sensor input that creates very wide rows and operations on these rows 
have started to time out regularly. We have been trying to find a solution to 
dividing wide rows but keep hitting limitations that move the problem around 
instead of solving it.
We have a partition key consisting of a sensorUnitId and a sensorId and use a 
time field to access each column in the row. We tried adding a time based 
entry, timeShardId, to the partition key that consists of the year and week of 
year during which the reading was taken. This works for a number of queries but 
for scanning all the readings against a particular sensorUnitId and sensorId 
combination, we seem to be stuck.
We won't know the range of valid values of the timeShardId for a given 
sensorUnitId and sensorId combination so would have to write to an additional 
table to track the valid timeShardId. We suspect this would create tombstone 
accumulation problems given the number of updates required to the same row so 
haven't tried this option.

Alternatively, we hit a different bottleneck in the form of SELECT DISTINCT in 
trying to directly access the partition keys. Since SELECT DISTINCT does not 
allow for a where clause to filter on the partition key values, we have to 
filter several hundred thousand partition keys just to find those related to 
the relevant sensorUnitId and sensorId. This problem will only grow worse for 
us.

Are there any other approaches that can be suggested? We have been looking 
around, but haven't found any references beyond the initial suggestion to add 
some sort of shard id to the partition key to handle wide rows.
Thanks,
Jason


Re: How to complete bootstrap with exception due to stream failure?

2016-02-28 Thread Jason Kania
Thanks for the reference to nodetool resetlocalschema as that will come in 
handy in the future. Thanks also for the reference to 
https://issues.apache.org/jira/browse/CASSANDRA-11050 which seems related, but 
I am not sure.

I was bootstrapping 192.168.10.10, which had nothing on it to start with. The 
bootstrap was failing while the schema definitions were being transferred. In 
the process of trying to get something working, I tried adding the dropped 
columns on the existing node and the new node but had no luck with that either.
I finally figured it out so I raised 
https://issues.apache.org/jira/browse/CASSANDRA-11273 with these details and 
the workaround that I found.
  From: Paulo Motta <pauloricard...@gmail.com>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Jason Kania 
<jason.ka...@ymail.com> 
 Sent: Sunday, February 28, 2016 10:01 PM
 Subject: Re: How to complete bootstrap with exception due to stream failure?
   
Were the columns sensor.lastEvaluation and sensordb.lastCheckTime dropped by 
any chance? If so, you might be hitting 
https://issues.apache.org/jira/browse/CASSANDRA-11050, fixed in upcoming 3.4.

If that's the case, you may want to check if nodes other than 192.168.10.10 
have the dropped columns in the system_schema.dropped_columns table, and if so, 
reset the local schema (nodetool resetlocalschema) of 192.168.10.10 to force a 
schema synchronization with other nodes. Another possible workaround is to 
manually include the dropped columns in the system_schema.dropped_columns table 
of 192.168.10.10.

2016-02-27 22:56 GMT-03:00 Jason Kania <jason.ka...@ymail.com>:

Hi,
I just reran the command and collected the following. Any suggestions would be 
appreciated.

Thanks,
Jason

from 192.168.10.8

ERROR [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 StreamSession.java:635 
- [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Remote peer 192.168.10.10 
failed stream session.
INFO  [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 
StreamResultFuture.java:182 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Session with /192.168.10.10 is complete
WARN  [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,858 
StreamResultFuture.java:209 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Stream failed

from 192.168.10.8 debug
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,414 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received Received (79256340--11e5-9f70-7d76a8de8480, #0)
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,854 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 
ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Sending File (Header (cfId: f3a137e0-024b-11e5-bb31-0d2316086bf7, #0, version: 
ma, format: BIG, estimated keys: 128, transfer size: 4653, compressed?: true, 
repairedAt: 0, level: 0), file: 
/home/cassandra/data/sensordb/sensor/ma-76-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 
CompressedStreamWriter.java:63 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Start streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db to 
/192.168.10.10, repairedAt = 0, totalSize = 4653
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,854 
CompressedStreamWriter.java:94 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Finished streaming file /home/cassandra/data/sensordb/sensor/ma-76-big-Data.db 
to /192.168.10.10, bytesTransferred = 4653, totalSize = 4653
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,855 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received Retry (faa55490-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,855 
ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Sending File (Header (cfId: faa55490-024b-11e5-bb31-0d2316086bf7, #0, version: 
ma, format: BIG, estimated keys: 128, transfer size: 705, compressed?: true, 
repairedAt: 0, level: 0), file: 
/home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 
CompressedStreamWriter.java:63 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Start streaming file /home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db 
to /192.168.10.10, repairedAt = 0, totalSize = 705
DEBUG [STREAM-OUT-/192.168.10.10] 2016-02-27 20:37:53,856 
CompressedStreamWriter.java:94 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Finished streaming file 
/home/cassandra/data/sensordb/sensorUnit/ma-79-big-Data.db to /192.168.10.10, 
bytesTransferred = 705, totalSize = 705
DEBUG [STREAM-IN-/192.168.10.10] 2016-02-27 20:37:53,857 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received Session Failed
ERROR [STREAM-I

Re: How to complete bootstrap with exception due to stream failure?

2016-02-27 Thread Jason Kania
 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received File (Header (cfId: 79256340--11e5-9f70-7d76a8de8480, #0, version: 
ma, format: BIG, estimated keys: 128, transfer size: 166627, compressed?: true, 
repairedAt: 0, level: 0), file: 
/home/cassandra/data/sensordb/listAttributes-7925634011e59f707d76a8de8480/ma-32-big-Data.db)
DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,412 
ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Sending Received (79256340--11e5-9f70-7d76a8de8480, #0)
DEBUG [CompactionExecutor:3] 2016-02-27 20:37:53,833 CompactionTask.java:217 - 
Compacted (e224bef0-ddbb-11e5-80c0-89f591237aca) 4 sstables to 
[/home/cassandra/data/system_distributed/parent_repair_history-deabd734b99d3b9c92e5fd92eb5abf14/ma-5-big,]
 to level=0.  2,743,164 bytes to 685,791 (~25% of original) in 1,096ms = 
0.596735MB/s.  0 total partitions merged to 57.  Partition merge counts were 
{4:57, }
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,850 
CompressedStreamReader.java:80 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 4653, ks = 
'sensordb', table = 'sensor'.
WARN  [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,851 StreamSession.java:641 
- [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Retrying for following error
java.lang.RuntimeException: Unknown column lastEvaluation during deserialization
    at 
org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
 ~[apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87)
 ~[apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
DEBUG [STREAM-OUT-/192.168.10.8] 2016-02-27 20:37:53,852 
ConnectionHandler.java:334 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Sending Retry (f3a137e0-024b-11e5-bb31-0d2316086bf7, #0)
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,852 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received null
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,853 
CompressedStreamReader.java:80 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Start receiving file #0 from /192.168.10.8, repairedAt = 0, size = 705, ks = 
'sensordb', table = 'sensorUnit'.
WARN  [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 StreamSession.java:641 
- [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] Retrying for following error
java.lang.RuntimeException: Unknown column lastCheckTime during deserialization
    at 
org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
 ~[apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:87)
 ~[apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:50)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
 [apache-cassandra-3.0.3.jar:3.0.3]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
DEBUG [STREAM-IN-/192.168.10.8] 2016-02-27 20:37:53,854 
ConnectionHandler.java:262 - [Stream #c9868f90-ddbb-11e5-80c0-89f591237aca] 
Received null


  From: Sebastian Estevez <sebastian.este...@datastax.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Saturday, February 27, 2016 8:24 PM
 Subject: Re: How to complete bootstrap with exception due to stream failure?
   
progress: 361% does not look right (probably a bug).

Can you check the corresponding messages on the other side of the stream? I.E. 
the system log for 192.168.10.8 around 18:02:06?
All the best,
Sebastián Estévez | Solutions Architect | 954 905 8615 | 
sebastian.este...@datastax.com



How to complete bootstrap with exception due to stream failure?

2016-02-27 Thread Jason Kania
Hello,
I am trying to get a node bootstrapped in 3.0.3, but just at the point where 
the bootstrap process is to complete, a broken pipe exception occurs so the 
bootstrap process hangs. Once I kill the bootstrap process, I can execute 
"nodetool bootstrap resume" again and the same problem will occur just at the 
end of the bootstrap exercise. Here is the tail of the log:
[2016-02-27 18:02:05,898] received file 
/home/cassandra/data/sensordb/listedAttributes-7925634011e59f707d76a8de8480/ma-30-big-Data.db
 (progress: 357%)
[2016-02-27 18:02:06,479] received file 
/home/cassandra/data/sensordb/notification-f7e3eaa0024b11e5bb310d2316086bf7/ma-38-big-Data.db
 (progress: 361%)
[2016-02-27 18:02:06,884] session with /192.168.10.8 complete (progress: 361%)
[2016-02-27 18:02:06,886] Stream failed
I attempted to run nodetool repair, but get the following which I have been 
told indicates that the replication factor is 1:
root@bull:~# nodetool repair
[2016-02-27 18:04:55,083] Nothing to repair for keyspace 'sensordb'

Thanks,
Jason


Migrating from single node to cluster

2016-02-25 Thread Jason Kania
Hi,
I am wondering if there is any documentation on migrating from a single node 
cassandra instance to a multinode cluster? My searches have been unsuccessful 
so far and I have had no luck playing with tools due to terse output from the 
tools.

I currently use a single node having data that must be retained and I want to 
add two nodes to create a cluster. I have tried to follow the instructions at 
the link below but it is unclear if it even works to go from 1 node to 2.
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_node_to_cluster_t.html
Almost no data has been transferred across and nodetool status is showing that 
0% of the data is owned by either node although I cannot determine what the 
percentages should be in the case that the configuration is intended for data 
redundancy.
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.10.8  648.16 MB  256     0.0%              5ce4f8ff-3ba4-41b2-8fd5-7d00d98c415f  rack1
UN  192.168.10.9  3.31 MB    256     0.0%              b56f6d58-0f60-473f-b202-f43ecc7a83f5  rack1

I also looked to see if there were any tools to check whether replication is in 
progress but had no luck.

The second node is bootstrapped and nodetool repair indicates that nothing 
needs to be done.
Any suggestions on a path to take? I am at a loss.

Thanks,
Jason


Re: Reenable data access after temporarily moving data out of data directory

2016-02-24 Thread Jason Kania
Thanks for the tool reference. That will help. The second part of my question 
was whether there is a way to actually perform data repair aside from copying 
data from a replica.
Thanks,
Jason
  From: Carlos Alonso <i...@mrcalonso.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Wednesday, February 24, 2016 5:31 AM
 Subject: Re: Reenable data access after temporarily moving data out of data 
directory
   
Hi Jason
Try this: 
https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRefresh.html
Carlos Alonso | Software Engineer | @calonso

On 24 February 2016 at 07:07, Jason Kania <jason.ka...@ymail.com> wrote:

Hi,
I encountered an error in Cassandra or the latest Oracle JVM that causes the 
JVM to terminate during compaction in my situation (CASSANDRA-11200). In trying 
to work around the problem and access the data, I moved the data files, e.g. 
ma-NNN-big-Filter.db, ma-367-big-Data.db etc., out of the data directory and 
ran some cleanup commands, which allowed the overall compactions to proceed.

Now I am wondering how I can get Cassandra to access the data again when it is 
put back into place. Right now, a SELECT * query on the table returns no 
results even though the files are back in place.
Also, are there any tools that actually repair the data rather than copying it 
from a replica elsewhere? With the JVM error, the database JVMs are not 
staying up.

Suggestions would be appreciated.
Thanks,
Jason




  

Reenable data access after temporarily moving data out of data directory

2016-02-23 Thread Jason Kania
Hi,
I encountered an error in Cassandra or the latest Oracle JVM that causes the 
JVM to terminate during compaction in my situation (CASSANDRA-11200). In trying 
to work around the problem and access the data, I moved the data files, e.g. 
ma-NNN-big-Filter.db, ma-367-big-Data.db etc., out of the data directory and 
ran some cleanup commands, which allowed the overall compactions to proceed.

Now I am wondering how I can get Cassandra to access the data again when it is 
put back into place. Right now, a SELECT * query on the table returns no 
results even though the files are back in place.
Also, are there any tools that actually repair the data rather than copying it 
from a replica elsewhere? With the JVM error, the database JVMs are not 
staying up.

Suggestions would be appreciated.
Thanks,
Jason


Comprehensive documentation on Cassandra Data modelling

2014-12-16 Thread Jason Kania
Hi,
I have been having a few exchanges with contributors to the project around what 
is possible with Cassandra and a common response that comes up when I describe 
functionality as broken or missing is that I am not modelling my data 
correctly. Unfortunately, I cannot seem to find comprehensive documentation on 
modelling with Cassandra. In particular, I am finding myself modelling by 
restriction rather than what I would like to do.

Does such documentation exist? If not, is there any effort to create such 
documentation? The DataStax documentation on data modelling is far too weak to 
be meaningful.

In particular, I am caught because:
1) I want to search on a specific column to make updates to it after further 
processing; i.e. I don't know its value on first insert
2) If I want to search on a column, it has to be part of the primary key
3) If a column is part of the primary key, it cannot be edited, so I have a 
circular dependency
Thanks,
Jason


Re: Comprehensive documentation on Cassandra Data modelling

2014-12-16 Thread Jason Kania
Ryan,
Thanks for the response. It offers a bit more clarity.
I think a series of blog posts with good real-world examples would go a long 
way toward increasing the usability of Cassandra. Right now I find the process 
like going through a minefield, because I only discover what is not possible 
after trying something that I find logical and failing.

For my specific questions, the problem is that since searching is only possible 
on columns in the primary key and the primary key cannot be updated, I am not 
sure what the appropriate solution is when data exists that needs to be 
searched and then updated. What is the preferable approach to this? Is the 
expectation to maintain a series of tables, one for each stage of data 
manipulation, each with its own primary key?
Thanks,
Jason
  From: Ryan Svihla rsvi...@datastax.com
 To: user@cassandra.apache.org 
 Sent: Tuesday, December 16, 2014 12:36 PM
 Subject: Re: Comprehensive documentation on Cassandra Data modelling
   
Data modeling a distributed application could be a book unto itself. However, I 
will add that modeling by restriction is basically the entire thought process 
in Cassandra data modeling, since it's a distributed hash table, and a core 
aspect of that sort of application is that you need to be able to quickly 
locate which server in the cluster owns the data you want (which is provided by 
the partition key).

In specific response to your questions:
1) as long as you know the primary key and the column name this just works. I'm 
not sure what the problem is
2) Yes, the partition key tells you which server owns the data, otherwise you'd 
have to scan all servers to find what you're asking for.
3) I'm not sure I understand this.

To summarize, all modeling can be understood once you embrace the idea that:

   - Querying a single server will be faster than querying many servers.
   - Multiple tables with the same data but different partition keys are 
much easier to scale than a single table for which you have to scan the whole 
cluster to find your answer. 

If you accept this, you've basically got the key principle down. Most other 
ideas are extensions of this; some nuances include dealing with tombstones, 
partition size, and ordering. I can answer any more specific questions.
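
A minimal sketch of the "multiple tables with the same data" idea, using plain 
Python dictionaries as stand-ins for two Cassandra tables. The class, the 
table names, and the columns (reading_id, status, value) are hypothetical and 
only for illustration. It also shows one common way around the "key column 
cannot be edited" problem: delete the row from the old partition and insert it 
under the new key.

```python
from collections import defaultdict


class DenormalizedReadings:
    """Two in-memory 'tables' holding the same rows under different keys."""

    def __init__(self):
        # Stand-in for a table keyed by reading_id.
        self.by_id = {}
        # Stand-in for a table partitioned by status.
        self.by_status = defaultdict(dict)

    def insert(self, reading_id, status, value):
        row = {"reading_id": reading_id, "status": status, "value": value}
        # Write the same data to both tables.
        self.by_id[reading_id] = row
        self.by_status[status][reading_id] = row

    def change_status(self, reading_id, new_status):
        row = self.by_id[reading_id]
        # status is a key in by_status, so it cannot be changed in place:
        # remove the row from the old partition, then add it to the new one.
        del self.by_status[row["status"]][reading_id]
        updated = dict(row, status=new_status)
        self.by_id[reading_id] = updated
        self.by_status[new_status][reading_id] = updated

    def find_by_status(self, status):
        # Single-partition lookup; no scan over every row is needed.
        return sorted(self.by_status.get(status, {}))


if __name__ == "__main__":
    readings = DenormalizedReadings()
    readings.insert("r1", "pending", 20.5)
    readings.insert("r2", "pending", 21.0)
    readings.change_status("r1", "processed")
    print(readings.find_by_status("pending"))
    print(readings.find_by_status("processed"))
```

The cost of this design is that every write touches both tables, which is the 
usual trade made in exchange for single-partition reads.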

I've been meaning to write a series of blog posts on this, but as I stated, 
it's almost a book unto itself. Data modeling a distributed application 
requires a fundamental rethink of all the assumptions we've been taught for 
master/slave style databases.




On Tue, Dec 16, 2014 at 10:46 AM, Jason Kania jason.ka...@ymail.com wrote:
Hi,
I have been having a few exchanges with contributors to the project around what 
is possible with Cassandra and a common response that comes up when I describe 
functionality as broken or missing is that I am not modelling my data 
correctly. Unfortunately, I cannot seem to find comprehensive documentation on 
modelling with Cassandra. In particular, I am finding myself modelling by 
restriction rather than what I would like to do.

Does such documentation exist? If not, is there any effort to create such 
documentation? The DataStax documentation on data modelling is far too weak to 
be meaningful.

In particular, I am caught because:
1) I want to search on a specific column to make updates to it after further 
processing; i.e. I don't know its value on first insert
2) If I want to search on a column, it has to be part of the primary key
3) If a column is part of the primary key, it cannot be edited, so I have a 
circular dependency
Thanks,
Jason



-- 
Ryan Svihla
Solution Architect
 



  

Access to locally partitioned data

2014-12-14 Thread Jason Kania
Hello,

Is there a way to query a table so that only the results held in the local 
partition are returned?

To give some background, my application requires millions of timers and since 
queue-like implementations are a bad fit/anti-pattern for Cassandra, I am 
moving to an in-memory system to manage these timers. However, I would like to 
partition the timers such that:

1) related DB queries using the same partitioning key are most likely handled 
locally to minimize traffic as these timers are short duration in nature
2) there is no need to manage multiple partitioning schemes for the same data 
as the cluster grows
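
To make the intent concrete, here is a rough sketch of routing in-memory 
timers by the same partitioning key that the database queries use. It is only 
an illustration: the Ring class and token_for function are hypothetical, and 
the MD5-based hash is a stand-in, not Cassandra's actual Murmur3-based 
partitioner, so it would not match the cluster's real token assignment.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a token. Stand-in hash, NOT Cassandra's partitioner."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: each node owns the ranges ending at its tokens."""

    def __init__(self, nodes, vnodes_per_node=8):
        entries = []
        for node in nodes:
            for i in range(vnodes_per_node):
                # Derive several tokens per node, loosely imitating vnodes.
                entries.append((token_for(f"{node}#{i}"), node))
        entries.sort()
        self._tokens = [token for token, _ in entries]
        self._owners = [node for _, node in entries]

    def owner(self, partition_key: str) -> str:
        # First ring token >= the key's token, wrapping around at the end.
        idx = bisect.bisect_left(self._tokens, token_for(partition_key))
        return self._owners[idx % len(self._owners)]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    for timer_key in ("sensor-17", "sensor-42", "sensor-99"):
        # A timer would be kept in memory on the node that owns its key.
        print(timer_key, "->", ring.owner(timer_key))
```

With such a mapping, a timer and the rows it relates to would land on the same 
node as long as both are keyed identically, and adding a node only moves the 
keys whose token ranges change hands.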

In all other respects Cassandra is one of the best databases for my needs as I 
am using it for time series data.

Thanks,

Jason