Re: Hintedhandoff mutation

2016-08-17 Thread Chris Lohfink
This question is probably better suited for the dev@ list, but AFAIK there is
no way to tell the difference. It is probably safe to look at the creation
time, though; hinted-handoff mutations tend to be older.
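As an illustration of the age heuristic above (all values are hypothetical; this is only a sketch of the idea, not Cassandra code):

```shell
# Illustration of the age heuristic (all values hypothetical).
# Cassandra write timestamps are microseconds since the epoch; a mutation
# that is much older than its peers is more likely to be a replayed hint.
MUTATION_TS=1471400000000000   # hypothetical writetime() of the mutation
NOW_TS=1471420000000000        # hypothetical "current" time
AGE_S=$(( (NOW_TS - MUTATION_TS) / 1000000 ))
echo "mutation age: ${AGE_S}s"
```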

Chris

On Wed, Aug 17, 2016 at 5:02 AM, Stone Fang  wrote:

> Hi All,
>
> I want to distinguish a hinted-handoff mutation from a normal write mutation
> when I receive a mutation.
>
> How can I do this in the Cassandra source code? I have not found any
> attribute for it in the Mutation class.
>
> Or is there no way to get this?
>
>
> thanks
> stone
>


Re: large number of pending compactions, sstables steadily increasing

2016-08-17 Thread Ezra Stuetzel
Yes, leveled compaction strategy.

Concurrent compactors were 2; I changed them to 8 recently, with no change.
At the same time I changed compaction throughput from 64 to 384 MB/s. The
number of pending compactions was still increasing after the change. Other
nodes are handling the same throughput with the previous compaction settings.
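For reference, a dry-run sketch of the tuning change described above (commands are echoed, not executed; this assumes a 2.2.x node):

```shell
# Dry run of the tuning change described above (commands echoed, not executed).
# setcompactionthroughput takes effect at runtime; concurrent_compactors is a
# cassandra.yaml setting on this version, so changing it needs a restart.
THROUGHPUT_CMD="nodetool setcompactionthroughput 384"
YAML_SETTING="concurrent_compactors: 8"
echo "$THROUGHPUT_CMD"
echo "edit cassandra.yaml -> $YAML_SETTING"
```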

We are using c4.2xlarge in ec2. 8 vCPUs, ssds, 15GB memory.

No errors or exceptions in logs. Some possibly relevant log entries I
noticed:

INFO  [CompactionExecutor:16] 2016-08-17 19:15:04,711
> CompactionManager.java:654 - Will not compact
> /export/cassandra/data/system/batchlog-0290003c977e397cac3efdfdc01d626b/lb-961-big:
> it is not an active sstable
>
> INFO  [CompactionExecutor:16] 2016-08-17 19:15:04,711
> CompactionManager.java:654 - Will not compact
> /export/cassandra/data/system/batchlog-0290003c977e397cac3efdfdc01d626b/lb-960-big:
> it is not an active sstable
>
> INFO  [CompactionExecutor:16] 2016-08-17 19:15:04,711
> CompactionManager.java:664 - No files to compact for user defined compaction
>
WARN  [CompactionExecutor:3] 2016-08-16 19:52:07,134
> BigTableWriter.java:184 - Writing large partition
> system/hints:3b4f02ef-ac1f-4bea-9d0c-1048564b749d (150461319 bytes)

WARN  [CompactionExecutor:3] 2016-08-16 19:52:09,501
> BigTableWriter.java:184 - Writing large partition
> system/hints:3b4f02ef-ac1f-4bea-9d0c-1048564b749d (149619989 bytes)

WARN  [epollEventLoopGroup-2-2] 2016-08-16 19:52:12,911 Frame.java:203 -
> Detected connection using native protocol version 2. Both version 1 and 2
> of the native protocol are now deprecated and support will be removed in
> Cassandra 3.0. You are encouraged to upgrade to a client driver using
> version 3 of the native protocol

WARN  [GossipTasks:1] 2016-08-16 20:51:45,643 FailureDetector.java:287 -
> Not marking nodes down due to local pause of 131385662140 > 50

WARN  [CompactionExecutor:5] 2016-08-17 01:50:05,200
> MajorLeveledCompactionWriter.java:63 - Many sstables involved in
> compaction, skipping storing ancestor information to avoid running out of
> memory

WARN  [CompactionExecutor:4] 2016-08-17 01:50:48,684
> MajorLeveledCompactionWriter.java:63 - Many sstables involved in
> compaction, skipping storing ancestor information to avoid running out of
> memory

WARN  [GossipTasks:1] 2016-08-17 04:35:10,697 FailureDetector.java:287 -
> Not marking nodes down due to local pause of 8628650983 > 50

WARN  [GossipTasks:1] 2016-08-17 04:42:55,524 FailureDetector.java:287 -
> Not marking nodes down due to local pause of 9141089664 > 50




On Wed, Aug 17, 2016 at 11:49 AM, Jeff Jirsa wrote:

> What compaction strategy? Looks like leveled – is that what you expect?
>
> Any exceptions in the logs?
>
> Are you throttling compaction?
>
> SSD or spinning disks?
>
> How many cores?
>
> How many concurrent compactors?
>
> *From: *Ezra Stuetzel 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Wednesday, August 17, 2016 at 11:39 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *large number of pending compactions, sstables steadily
> increasing
>
>
>
> I have one node in my 2.2.7 cluster (just upgraded from 2.2.6 hoping to fix
> the issue) which seems to be stuck in a weird state, with a large number of
> pending compactions and sstables. The node is compacting about 500 GB/day,
> and the number of pending compactions is going up by about 50/day; it is at
> about 2300 now. I have tried increasing the number of compaction threads
> and the compaction throughput, which doesn't seem to help eliminate the
> many pending compactions.
>
>
>
> I have tried running 'nodetool cleanup' and 'nodetool compact'. The latter
> has fixed the issue in the past, but most recently I was getting OOM
> errors, probably due to the large number of sstables. I upgraded to 2.2.7
> and am no longer getting OOM errors, but it still does not resolve the
> issue. I do see this message in the logs:
>
>
>
> INFO  [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
> CompactionManager.java:610 - Cannot perform a full major compaction as
> repaired and unrepaired sstables cannot be compacted together. These two
> set of sstables will be compacted separately.
>
> Below are the 'nodetool tablestats' results comparing a normal node and the
> problematic node. You can see the problematic node has far more sstables,
> and they are all in level 1. What is the best way to fix this? Can I just
> delete those sstables somehow and then run a repair?
>
> Normal node
>
> keyspace: mykeyspace
>
> Read Count: 0
>
> Read Latency: NaN ms.
>
> Write Count: 31905656
>
> Write Latency: 0.051713177939359714 ms.
>
> Pending Flushes: 0
>
> Table: mytable
>
> SSTable count: 1908
>
> SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0,
> 0, 0, 0]
>
> Space used (live): 

large number of pending compactions, sstables steadily increasing

2016-08-17 Thread Ezra Stuetzel
I have one node in my 2.2.7 cluster (just upgraded from 2.2.6 hoping to fix
the issue) which seems to be stuck in a weird state, with a large number of
pending compactions and sstables. The node is compacting about 500 GB/day,
and the number of pending compactions is going up by about 50/day; it is at
about 2300 now. I have tried increasing the number of compaction threads and
the compaction throughput, which doesn't seem to help eliminate the many
pending compactions.

I have tried running 'nodetool cleanup' and 'nodetool compact'. The latter
has fixed the issue in the past, but most recently I was getting OOM
errors, probably due to the large number of sstables. I upgraded to 2.2.7
and am no longer getting OOM errors, but it still does not resolve the
issue. I do see this message in the logs:

INFO  [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
> CompactionManager.java:610 - Cannot perform a full major compaction as
> repaired and unrepaired sstables cannot be compacted together. These two
> set of sstables will be compacted separately.
>
Below are the 'nodetool tablestats' results comparing a normal node and the
problematic node. You can see the problematic node has far more sstables,
and they are all in level 1. What is the best way to fix this? Can I just
delete those sstables somehow and then run a repair?

> Normal node

keyspace: mykeyspace
>
> Read Count: 0
>
> Read Latency: NaN ms.
>
> Write Count: 31905656
>
> Write Latency: 0.051713177939359714 ms.
>
> Pending Flushes: 0
>
> Table: mytable
>
> SSTable count: 1908
>
> SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0,
> 0, 0, 0]
>
> Space used (live): 301894591442
>
> Space used (total): 301894591442
>
>
>>
>> Problematic node
>
> Keyspace: mykeyspace
>
> Read Count: 0
>
> Read Latency: NaN ms.
>
> Write Count: 30520190
>
> Write Latency: 0.05171286705620116 ms.
>
> Pending Flushes: 0
>
> Table: mytable
>
> SSTable count: 14105
>
> SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0,
> 0]
>
> Space used (live): 561143255289
>
> Space used (total): 561143255289
>
> Thanks,

Ezra
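As an aside, the "SSTables in each level" lines above read as count or count/limit per level; here is a small sketch parsing one (the interpretation that the first bucket is L0 is my assumption, and the sample line is copied from the stats above):

```shell
# Reading the "SSTables in each level" line from 'nodetool tablestats':
# each entry is count or count/limit; the first bucket is taken to be L0.
LINE="SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0, 0]"
L0=$(echo "$LINE" | sed 's/.*\[//; s/\/.*//')
echo "L0 sstable count: $L0"   # 13039 sstables where LCS expects at most 4
```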


Re: migrating from 2.1.2 to 3.0.8 log errors

2016-08-17 Thread Adil
Just to share with you: running rebuild_index solved the problem.
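For reference, a dry-run sketch of the command in question (echoed, not executed; keyspace, table, and index names are hypothetical placeholders):

```shell
# Dry run (echoed, not executed): rebuild a secondary index after an upgrade.
# Keyspace, table, and index names are hypothetical placeholders.
REBUILD_CMD="nodetool rebuild_index mykeyspace mytable mytable_col_idx"
echo "$REBUILD_CMD"
```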

2016-08-11 22:05 GMT+02:00 Adil :

> After migrating C* from 2.1.2 to 3.0.8, all queries with a WHERE condition
> involving an indexed column return zero rows for the old data, whereas newly
> inserted data are returned by the same query. I'm guessing that something
> was left incomplete about the indexes. Should we rebuild the indexes? Any
> ideas?
> Thanks
> Ad.
>
> On 10 Aug 2016 at 23:58, "Adil" wrote:
>
>> Thank you for your response. We have updated the DataStax driver to 3.1.0
>> using the V3 protocol; I think some webapps are still using the 2.1.6 Java
>> driver, and we will upgrade them. But we noticed something strange: on the
>> webapps upgraded to 3.1.0, some queries return zero results even though
>> the data exists; I can see it with cqlsh.
>>
>> 2016-08-10 20:48 GMT+02:00 Tyler Hobbs :
>>
>>> That just means that a client/driver disconnected.  Those log messages
>>> are supposed to be suppressed, but perhaps that stopped working in 3.x due
>>> to another change.
>>>
>>> On Wed, Aug 10, 2016 at 10:33 AM, Adil  wrote:
>>>
 Hi guys,
 We have migrated our cluster (5 nodes in DC1 and 5 nodes in DC2) from
 Cassandra 2.1.2 to 3.0.8. All seems fine: executing nodetool status shows
 all nodes UN, but each node's log shows this error continuously:
 java.io.IOException: Error while read(...): Connection reset by peer
 at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
 at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]

 We have installed Java 8u101.

 Any idea what the problem could be?

 thanks

 Adil


>>>
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax 
>>>
>>
>>


Re: sstableloader

2016-08-17 Thread Jean Tremblay
Thank you for your answer Kai.

On 17 Aug 2016, at 11:34, Kai Wang wrote:

yes, you are correct.

On Tue, Aug 16, 2016 at 2:37 PM, Jean Tremblay wrote:
Hi,

I’m using Cassandra 3.7.

In the documentation for sstableloader I read the following:

<< Note: To get the best throughput from SSTable loading, you can use multiple 
instances of sstableloader to stream across multiple machines. No hard limit 
exists on the number of SSTables that sstableloader can run at the same time, 
so you can add additional loaders until you see no further improvement.>>

Does this mean that I can stream my sstables to my cluster from many
instances of sstableloader running simultaneously on many client machines?

I ask because I would like to improve the transfer speed of my sstables to my
cluster.

Kind regards and thanks for your comments.

Jean
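A dry-run sketch of the parallel-loading setup described in the quoted note (hostnames, contact point, and path are hypothetical; commands are echoed rather than executed):

```shell
# One sstableloader per client machine, all streaming to the same cluster.
# Hostnames, contact point, and data path are hypothetical; commands echoed.
DATA_DIR="/path/to/mykeyspace/mytable"
CONTACT="10.0.0.1"   # any reachable node in the target cluster (-d option)
for HOST in loader1 loader2 loader3; do
  echo "ssh $HOST sstableloader -d $CONTACT $DATA_DIR"
done
```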




Hintedhandoff mutation

2016-08-17 Thread Stone Fang
Hi All,

I want to distinguish a hinted-handoff mutation from a normal write mutation
when I receive a mutation.

How can I do this in the Cassandra source code? I have not found any attribute
for it in the Mutation class.

Or is there no way to get this?


thanks
stone


Re: Corrupt SSTABLE over and over

2016-08-17 Thread Kai Wang
This might not be good news to you, but my experience is that C* 2.X on
Windows is not ready for production yet. I've seen various filesystem-related
errors, and in one of the JIRAs I was told that major work (or rework) was
done in 3.X to improve C* stability on Windows.

On Tue, Aug 16, 2016 at 3:44 AM, Bryan Cheng  wrote:

> Hi Alaa,
>
> Sounds like you have problems that go beyond Cassandra- likely filesystem
> corruption or bad disks. I don't know enough about Windows to give you any
> specific advice but I'd try a run of chkdsk to start.
>
> --Bryan
>
> On Fri, Aug 12, 2016 at 5:19 PM, Alaa Zubaidi (PDF) 
> wrote:
>
>> Hi Bryan,
>>
>> Changing disk_failure_policy to best_effort, and running nodetool scrub,
>> did not work, it generated another error:
>> java.nio.file.AccessDeniedException
>>
>> Also tried to remove all files (data, commitlog, savedcaches) and restart
>> the node fresh, and still I am getting corruption.
>>
>> and Still nothing that indicate there is a HW issue?
>> All other nodes are fine
>>
>> Regards,
>> Alaa
>>
>>
>> On Fri, Aug 12, 2016 at 12:00 PM, Bryan Cheng 
>> wrote:
>>
>>> Should also add that if the scope of corruption is _very_ large, and you
>>> have a good, aggressive repair policy (read: you are confident in the
>>> consistency of the data elsewhere in the cluster), you may just want to
>>> decommission and rebuild that node.
>>>
>>> On Fri, Aug 12, 2016 at 11:55 AM, Bryan Cheng 
>>> wrote:
>>>
 Looks like you're doing the offline scrub; have you tried the online one?

 Here's my typical process for corrupt SSTables.

 With disk_failure_policy set to stop, examine the failing sstables. If
 they are very small (in the range of kbs), it is unlikely that there is any
 salvageable data there. Just delete them, start the machine, and schedule a
 repair ASAP.

 If they are large, then it may be worth salvaging. If the scope of
 corruption is reasonable (limited to a few sstables scattered among
 different keyspaces), set disk_failure_policy to best_effort, start the
 machine up, and run nodetool scrub. This is the online scrub, which is
 faster than the offline scrub (at least as of 2.1.12, the last time I had
 to do this).

 Only if all else fails, attempt the very painful offline sstablescrub.
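The triage sequence above, as a dry-run sketch (the sstable path, keyspace, and table names are hypothetical; every command is echoed, not executed):

```shell
# Dry-run sketch of the corrupt-sstable triage (commands echoed, not run).
# Path, keyspace, and table names are hypothetical placeholders.
SSTABLE_PREFIX="/var/lib/cassandra/data/mykeyspace/mytable/la-1-big"
# 1. With disk_failure_policy: stop, check how big the failing sstable is.
echo "ls -lh ${SSTABLE_PREFIX}-Data.db"
# 2. If tiny (KBs): delete all of its component files, restart, repair ASAP.
echo "rm ${SSTABLE_PREFIX}-*"
echo "nodetool repair mykeyspace mytable"
# 3. If large: set disk_failure_policy: best_effort, start the node, and run
#    the (faster) online scrub.
echo "nodetool scrub mykeyspace mytable"
# 4. Only if all else fails, run the offline scrub with the node down.
echo "sstablescrub mykeyspace mytable"
```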

 Is the VMware client Windows? (Trying to make sure it's not just the
 host.) YMMV, but in the past Windows was somewhat of a neglected platform
 wrt Cassandra. I think you'd have a much easier time getting help if
 running Linux is an option here.



 On Fri, Aug 12, 2016 at 9:16 AM, Alaa Zubaidi (PDF) <
 alaa.zuba...@pdf.com> wrote:

> Hi Jason,
>
> Thanks for your input.
> That's what I am afraid of.
> Did you find any HW errors in the VMware and HW logs? Any indication
> that the HW is the reason? I need to make sure that this is the reason
> before asking the customer to spend more money.
>
> Thanks,
> Alaa
>
> On Thu, Aug 11, 2016 at 11:02 PM, Jason Wee 
> wrote:
>
>> cassandra run on virtual server (vmware)?
>>
>> > I tried sstablescrub but it crashed with hs-err-pid-...
>> maybe try with larger heap allocated to sstablescrub
>>
>> I ran into this sstable corruption as well (on Cassandra 1.2). First I
>> tried nodetool scrub; it persisted. Then the offline sstablescrub; it
>> still persisted. I wiped the node and it happened again. Then I changed
>> the hardware (disk and memory) and things went well.
>>
>> hth
>>
>> jason
>>
>>
>> On Fri, Aug 12, 2016 at 9:20 AM, Alaa Zubaidi (PDF)
>>  wrote:
>> > Hi,
>> >
>> > I have a 16 Node cluster, Cassandra 2.2.1 on Windows, local
>> installation
>> > (NOT on the cloud)
>> >
>> > and I am getting
>> > ERROR [CompactionExecutor:2] 2016-08-12 06:51:52,983
>> > CassandraDaemon.java:183 - Exception in thread Thread[CompactionExecutor:2,1,main]
>> > org.apache.cassandra.io.FSReadError:
>> > org.apache.cassandra.io.sstable.CorruptSSTableException:
>> > org.apache.cassandra.io.compress.CorruptBlockException:
>> > (E:\\la-4886-big-Data.db): corruption detected, chunk at
>> > 4969092 of length 10208.
>> > at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:357)
>> > ~[apache-cassandra-2.2.1.jar:2.2.1]
>> > 
>> > 
>> > ERROR [CompactionExecutor:2] ... FileUtils.java:463 - Exiting
>> > forcefully due to file system exception on startup, disk failure policy
>> > "stop"
>> >
>> > I tried sstablescrub but it crashed with hs-err-pid-...
>> > I removed the corrupted file and started the node again; after one
>> > day the corruption came back again, 

Re: sstableloader

2016-08-17 Thread Kai Wang
yes, you are correct.

On Tue, Aug 16, 2016 at 2:37 PM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:

> Hi,
>
> I’m using Cassandra 3.7.
>
> In the documentation for sstableloader I read the following:
>
> << Note: To get the best throughput from SSTable loading, you can use
> multiple instances of sstableloader to stream across multiple machines. No
> hard limit exists on the number of SSTables that sstableloader can run at
> the same time, so you can add additional loaders until you see no further
> improvement.>>
>
> Does this mean that I can stream my sstables to my cluster from many
> instances of sstableloader running simultaneously on many client machines?
>
> I ask because I would like to improve the transfer speed of my sstables to
> my cluster.
>
> Kind regards and thanks for your comments.
>
> Jean
>


Re: Cassandra Exception

2016-08-17 Thread Kamal C
Carlos,
Yes, I'm running multiple clients simultaneously. Each one of them tries to
create the table if it doesn't exist in Cassandra.

Ali,
I've cleared the data directory. Next time, if it reoccurs, I'll follow the
steps listed and come back here.

Thanks for the information.

Regards,
Kamal C
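A dry-run sketch of the checks Ali suggests in the quoted reply below (the keyspace and table names come from the logs; the data path is hypothetical, and every command is echoed rather than executed):

```shell
# Dry run (echoed, not executed) of the suggested checks.
# Keyspace/table come from the logs; the data path is hypothetical.
QUERY="select id from system_schema.tables where keyspace_name='fms' and table_name='flapalarmcache';"
echo "cqlsh -e \"$QUERY\""
# One on-disk directory per table id should exist; see which holds data.
DATA_DIR="/var/lib/cassandra/data/fms"
echo "ls -l $DATA_DIR"
```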


On Tue, Aug 16, 2016 at 3:29 PM,  wrote:

> It seems that the table was created twice (concurrently): once with id
> b2b1bcd0-5f94-11e6-867a-cb7bb92c and once with
> b2b47bf0-5f94-11e6-867a-cb7bb92c.
> Programmatically creating tables is generally a bad idea, especially with
> multiple clients.
>
> Anyway, execute the following and see what you have in there:
> select * from system_schema.tables where id = 
> b2b47bf0-5f94-11e6-867a-cb7bb92c
> allow filtering;
> select * from system_schema.tables where id = 
> b2b1bcd0-5f94-11e6-867a-cb7bb92c
> allow filtering;
>
> Go to your data directory; inside your keyspace you should see your table
> name followed by a uuid. How many directories for your table do you have
> there, and what's inside them? Do both of them have data, or just one?
>
> -Ali H
>
>
>
> From: Kamal C 
> To: user@cassandra.apache.org,
> Date: 08/16/2016 11:51 AM
> Subject: Cassandra Exception
> --
>
>
>
> Hi all,
>
> I'm using Cassandra 3.7, JRE 1.8, and CentOS 6.5 in standalone mode (only
> one Cassandra node). The Cassandra clients are on JRE 1.7. Once in a while
> I get the exceptions below in Cassandra, but the data (put/get) is fine.
> I've searched for the root cause but have been unable to pin it down.
>
>
> INFO  07:24:53 Create new table: org.apache.cassandra.config.
> CFMetaData@64b34445[cfId=b2b1bcd0-5f94-11e6-867a-cb7bb92c,ksName=fms,
> cfName=flapalarmcache,flags=[COMPOUND],params=TableParams{comment=,
> read_repair_chance=0.0, dclocal_read_repair_chance=0.1,
> bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=864000,
> default_time_to_live=0, memtable_flush_period_in_ms=0,
> min_index_interval=128, max_index_interval=2048, 
> speculative_retry=99PERCENTILE,
> caching={'keys' : 'ALL', 'rows_per_partition' : 'NONE'},
> compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,
> options={min_threshold=4, max_threshold=32}}, compression=org.apache.
> cassandra.schema.CompressionParams@baaacb37, extensions={}},comparator=
> comparator(),partitionColumns=[[] | 
> [alarm]],partitionKeyColumns=[ColumnDefinition{name=key,
> type=org.apache.cassandra.db.marshal.BytesType, kind=PARTITION_KEY,
> position=0}],clusteringColumns=[],keyValidator=org.apache.
> cassandra.db.marshal.BytesType,columnMetadata=[ColumnDefinition{name=alarm,
> type=org.apache.cassandra.db.marshal.BytesType, kind=REGULAR,
> position=-1}, ColumnDefinition{name=key, 
> type=org.apache.cassandra.db.marshal.BytesType,
> kind=PARTITION_KEY, position=0}],droppedColumns={},triggers=[],indexes=[]]
> INFO  07:24:53 Create new table: org.apache.cassandra.config.
> CFMetaData@641c948a[cfId=b2b47bf0-5f94-11e6-867a-cb7bb92c,ksName=fms,
> cfName=flapalarmcache,flags=[COMPOUND],params=TableParams{comment=,
> read_repair_chance=0.0, dclocal_read_repair_chance=0.1,
> bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=864000,
> default_time_to_live=0, memtable_flush_period_in_ms=0,
> min_index_interval=128, max_index_interval=2048, 
> speculative_retry=99PERCENTILE,
> caching={'keys' : 'ALL', 'rows_per_partition' : 'NONE'},
> compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,
> options={min_threshold=4, max_threshold=32}}, compression=org.apache.
> cassandra.schema.CompressionParams@baaacb37, extensions={}},comparator=
> comparator(),partitionColumns=[[] | 
> [alarm]],partitionKeyColumns=[ColumnDefinition{name=key,
> type=org.apache.cassandra.db.marshal.BytesType, kind=PARTITION_KEY,
> position=0}],clusteringColumns=[],keyValidator=org.apache.
> cassandra.db.marshal.BytesType,columnMetadata=[ColumnDefinition{name=alarm,
> type=org.apache.cassandra.db.marshal.BytesType, kind=REGULAR,
> position=-1}, ColumnDefinition{name=key, 
> type=org.apache.cassandra.db.marshal.BytesType,
> kind=PARTITION_KEY, position=0}],droppedColumns={},triggers=[],indexes=[]]
> INFO  07:24:55 Initializing fms.flapalarmcache
> ERROR 07:24:57 Exception in thread Thread[MigrationStage:1,5,main]
> org.apache.cassandra.exceptions.ConfigurationException: Column family ID
> mismatch (found b2b47bf0-5f94-11e6-867a-cb7bb92c; expected
> b2b1bcd0-5f94-11e6-867a-cb7bb92c)
> at 
> org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:789)
> ~[apache-cassandra-3.7.jar:3.7]
> at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:749)
> ~[apache-cassandra-3.7.jar:3.7]
> at org.apache.cassandra.config.Schema.updateTable(Schema.java:663)
>