Re: Crash with TombstoneOverwhelmingException
Agree with Robert about the dogfood. http://www.datastax.com/docs/datastax_enterprise3.2/dse_release_notes#rn-3-2-4 It may be a good indicator when DSE starts using Cassandra 2.x.y in production.

> From: Robert Coli
> Date: Mon, Dec 30, 2013 at 2:58 PM
> Subject: Re: Crash with TombstoneOverwhelmingException
> To: "user@cassandra.apache.org"
>
> [snip]
Re: Crash with TombstoneOverwhelmingException
With Cassandra, an update is equivalent to an insert (an upsert).

Cyril Scetbon

> On 14 Jan 2014, at 08:38, David Tinker wrote:
>
> We never delete rows but we do a lot of updates. Is that where the
> tombstones are coming from?
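Because UPDATE and INSERT share the same write path, a plain overwrite does not by itself create a tombstone; tombstones come from DELETEs, from writing null into a column, from expired TTLs, and from collection overwrites. A toy Python model of the cell semantics (a sketch only — the names and structure here are illustrative, not Cassandra internals or driver code):

```python
# Toy model of Cassandra upsert/tombstone cell semantics.
# Assumption: a partition is {column: (value, timestamp)}; a CQL null
# write is stored as an explicit tombstone marker, as Cassandra does
# internally. Illustrative only -- not Cassandra or driver code.

TOMBSTONE = object()  # sentinel standing in for a deletion marker


class Partition:
    def __init__(self):
        self.cells = {}

    def upsert(self, column, value, ts):
        # UPDATE and INSERT are the same write path: newest timestamp wins.
        current = self.cells.get(column)
        if current is None or ts >= current[1]:
            # Writing None (a CQL null) records a tombstone, not an absence.
            self.cells[column] = (TOMBSTONE if value is None else value, ts)

    def live_value(self, column):
        cell = self.cells.get(column)
        return None if cell is None or cell[0] is TOMBSTONE else cell[0]

    def tombstone_count(self):
        return sum(1 for v, _ in self.cells.values() if v is TOMBSTONE)


p = Partition()
p.upsert("name", "alice", ts=1)  # INSERT
p.upsert("name", "bob", ts=2)    # plain UPDATE: overwrite, no tombstone
p.upsert("email", None, ts=3)    # null write: leaves a tombstone
```

So a heavy-update workload would not by itself generate tombstones unless the updates write nulls, set TTLs, or replace collections.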
Re: Crash with TombstoneOverwhelmingException
We are seeing the exact same exception in our logs. Is there any workaround?

We never delete rows but we do a lot of updates. Is that where the tombstones are coming from?

On Wed, Dec 25, 2013 at 5:24 PM, Sanjeeth Kumar wrote:

> Hi all,
> One of my cassandra nodes crashes with the following exception
> periodically -
>
> [snip]

--
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration
Re: Crash with TombstoneOverwhelmingException
On Wed, Dec 25, 2013 at 10:01 AM, Edward Capriolo wrote:

> I have to hijack this thread. There seem to be many problems with the
> 2.0.3 release.

+1. There is no 2.0.x release I consider production ready, even after today's 2.0.4.

> Outside of passing all unit tests, what factors into the release voting
> process? What other type of extended real-world testing should be done to
> find bugs like this one that unit testing won't?

I also +1 these questions. Voting seems of limited use given the outputs of the process.

> Here is a wacky idea that I am half serious about. Make a CMS for
> http://cassndra.apache.org that backs its data and reporting into
> Cassandra. No release unless the Cassandra db that serves the site is
> upgraded first. :)

I agree wholeheartedly that eating one's own dogfood is informative.

=Rob
Re: Crash with TombstoneOverwhelmingException
You can read the comments about this new feature here: https://issues.apache.org/jira/browse/CASSANDRA-6117

2013/12/27 Kais Ahmed:

> This threshold is to prevent bad performance, you can increase the value.
>
> [snip]
Re: Crash with TombstoneOverwhelmingException
This threshold is to prevent bad performance; you can increase the value.

2013/12/27 Sanjeeth Kumar:

> But I would also like to understand why this threshold value is important,
> so that I can set a right threshold.
>
> [snip]
Re: Crash with TombstoneOverwhelmingException
Thanks for the replies.

I don't think this is just a warning incorrectly logged as an error. Every time there is a crash, this is the exact traceback I see in the logs. I just browsed through the code: it throws a TombstoneOverwhelmingException in these situations, and I did not see it being caught and handled anywhere. I might be wrong though.

But I would also like to understand why this threshold value is important, so that I can set a right threshold.

- Sanjeeth

On Fri, Dec 27, 2013 at 11:33 AM, Edward Capriolo wrote:

> I do not think the feature is supposed to crash the server. It could be
> that the message is in the logs and the crash is not related to this
> message.
>
> [snip]
Re: Crash with TombstoneOverwhelmingException
I do not think the feature is supposed to crash the server. It could be that the message is in the logs and the crash is not related to this message. WARN might be a better logging level for any message, even though the first threshold is WARN and the second is FAIL. ERROR is usually something more dramatic.

On Wed, Dec 25, 2013 at 1:02 PM, Laing, Michael wrote:

> It's a feature:
>
> In the stock cassandra.yaml file for 2.0.3 see:
>
> [snip]
Re: Crash with TombstoneOverwhelmingException
It's a feature:

In the stock cassandra.yaml file for 2.0.3 see:

> # When executing a scan, within or across a partition, we need to keep
> # the tombstones seen in memory so we can return them to the coordinator,
> # which will use them to make sure other replicas also know about the
> # deleted rows.
> # With workloads that generate a lot of tombstones, this can cause
> # performance problems and even exhaust the server heap.
> # (http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets)
> # Adjust the thresholds here if you understand the dangers and want to
> # scan more tombstones anyway. These thresholds may also be adjusted
> # at runtime using the StorageService mbean.
> tombstone_warn_threshold: 1000
> tombstone_failure_threshold: 100000

You are hitting the failure threshold.

ml

On Wed, Dec 25, 2013 at 12:17 PM, Rahul Menon wrote:

> Sanjeeth,
>
> Looks like the error is being populated from the hintedhandoff, what
> is the size of your hints cf?
>
> [snip]
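A minimal sketch of how the two thresholds behave during a scan (modeled on the cassandra.yaml description above, not taken from Cassandra source; the function and names are illustrative; the defaults shown are the stock 2.0.x values):

```python
# Sketch of the tombstone warn/fail thresholds during a slice scan.
# Assumption: behavior modeled loosely on the cassandra.yaml comments,
# not the actual SliceQueryFilter implementation.

TOMBSTONE_WARN_THRESHOLD = 1000        # stock 2.0.x default
TOMBSTONE_FAILURE_THRESHOLD = 100000   # stock 2.0.x default


class TombstoneOverwhelmingException(Exception):
    pass


def slice_scan(cells, warn=TOMBSTONE_WARN_THRESHOLD,
               fail=TOMBSTONE_FAILURE_THRESHOLD):
    """cells: iterable of (value, is_tombstone) pairs. Returns live values."""
    live, tombstones = [], 0
    for value, is_tombstone in cells:
        if is_tombstone:
            tombstones += 1
            if tombstones > fail:
                # The path behind the ERROR in the logs above: the read
                # is aborted outright, not silently degraded.
                raise TombstoneOverwhelmingException(
                    "Scanned over %d tombstones; query aborted" % fail)
        else:
            live.append(value)
    if tombstones > warn:
        pass  # the real code logs a WARN with query details here
    return live
```

Raising the failure threshold only moves the cliff; the durable fix is to stop accumulating tombstones in heavily scanned partitions.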
Re: Crash with TombstoneOverwhelmingException
I have to hijack this thread. There seem to be many problems with the 2.0.3 release.

If this exception is being generated by hinted handoff, I can understand where it is coming from. If you have many hints and many tombstones, then this new feature interacts with the hint delivery process in a bad way. If I understand the feature correctly, it should always be off for the hints, because regardless of how many tombstones are in the hints this rule should not apply.

I want to bring up these questions: Outside of passing all unit tests, what factors into the release voting process? What other type of extended real-world testing should be done to find bugs like this one that unit testing won't?

Not trying to call anyone out over this feature/bug. I totally understand why you would want a warning, or want to opt out of a read scanning over a massive number of tombstones, and I think it is a smart feature. But what I want more is to trust that every release is battle tested.

Here is a wacky idea that I am half serious about: make a CMS for http://cassndra.apache.org that backs its data and reporting into Cassandra. No release unless the Cassandra db that serves the site is upgraded first. :)

On Wed, Dec 25, 2013 at 12:17 PM, Rahul Menon wrote:

> Sanjeeth,
>
> Looks like the error is being populated from the hintedhandoff, what is
> the size of your hints cf?
>
> [snip]
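The interaction with hints can be seen with a queue-shaped toy model (illustrative only — this is not the actual hints code path): delivering a hint deletes it in place, so the hints table behaves like a queue-pattern table, and every later scan must step over the tombstones of already-delivered hints before reaching a live one.

```python
# Toy queue model of the hints table (an assumption-laden sketch, not
# the HintedHandOffManager code path). Delivering a hint deletes it in
# place, leaving a tombstone that every subsequent scan must skip.


class HintQueue:
    def __init__(self):
        self.rows = []  # value, or None for a tombstone left by a delete

    def enqueue(self, hint):
        self.rows.append(hint)

    def deliver_next(self):
        # Deliver (and delete) the first live hint, if any.
        for i, value in enumerate(self.rows):
            if value is not None:
                self.rows[i] = None  # delete -> tombstone
                return value
        return None

    def tombstones_before_next_live(self):
        # How many tombstones a read skips to reach the next live hint.
        count = 0
        for value in self.rows:
            if value is None:
                count += 1
            else:
                break
        return count


q = HintQueue()
for i in range(5):
    q.enqueue("hint-%d" % i)
for _ in range(4):
    q.deliver_next()
# A read now skips 4 tombstones before finding hint-4; with enough
# delivered hints, that count crosses tombstone_failure_threshold.
```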
Re: Crash with TombstoneOverwhelmingException
Sanjeeth,

Looks like the error is being populated from the hintedhandoff, what is the size of your hints cf?

Thanks
Rahul

On Wed, Dec 25, 2013 at 8:54 PM, Sanjeeth Kumar wrote:

> Hi all,
> One of my cassandra nodes crashes with the following exception
> periodically -
>
> [snip]
Crash with TombstoneOverwhelmingException
Hi all,

One of my cassandra nodes crashes with the following exception periodically -

ERROR [HintedHandoff:33] 2013-12-25 20:29:22,276 SliceQueryFilter.java (line 200) Scanned over 100000 tombstones; query aborted (see tombstone_fail_threshold)
ERROR [HintedHandoff:33] 2013-12-25 20:29:22,278 CassandraDaemon.java (line 187) Exception in thread Thread[HintedHandoff:33,1,main]
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
        at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
        at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:92)
        at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

Why does this happen? Does this relate to any incorrect config value?

The Cassandra Version I'm running is
ReleaseVersion: 2.0.3

- Sanjeeth