Hi Ben,

Any chance that you are running Kudu 0.9.0 instead of 0.9.1? There's a known serious bug in 0.9.0 which can cause this kind of corruption.
Assuming that you are running with replication count 3 this time, you should be able to move aside that tablet metadata file and start the server. It will recreate a new repaired replica automatically.

-Todd

On Mon, Jul 18, 2016 at 10:28 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
> During my re-population of the Kudu table, I am getting this error trying
> to restart a tablet server after it went down. The job that populates this
> table has been running for over a week.
>
> [libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse
> message of type "kudu.tablet.TabletSuperBlockPB" because it is missing
> required fields: rowsets[2324].columns[15].block
> F0718 17:01:26.783571 468 tablet_server_main.cc:55] Check failed:
> _s.ok() Bad status: IO error: Could not init Tablet Manager: Failed to open
> tablet metadata for tablet: 24637ee6f3e5440181ce3f20b1b298ba: Failed to
> load tablet metadata for tablet id 24637ee6f3e5440181ce3f20b1b298ba: Could
> not load tablet metadata from
> /mnt/data1/kudu/data/tablet-meta/24637ee6f3e5440181ce3f20b1b298ba: Unable
> to parse PB from path:
> /mnt/data1/kudu/data/tablet-meta/24637ee6f3e5440181ce3f20b1b298ba
> *** Check failure stack trace: ***
> @ 0x7d794d google::LogMessage::Fail()
> @ 0x7d984d google::LogMessage::SendToLog()
> @ 0x7d7489 google::LogMessage::Flush()
> @ 0x7da2ef google::LogMessageFatal::~LogMessageFatal()
> @ 0x78172b (unknown)
> @ 0x344d41ed5d (unknown)
> @ 0x7811d1 (unknown)
>
> Does anyone know what this means?
>
> Thanks,
> Ben
>
> On Jul 11, 2016, at 10:47 AM, Todd Lipcon <t...@cloudera.com> wrote:
>
> On Mon, Jul 11, 2016 at 10:40 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>
>> Todd,
>>
>> I had it at one replica. Do I have to recreate?
>>
>
> We don't currently have the ability to "accept data loss" on a tablet (or
> set of tablets). If the machine is gone for good, then currently the only
> easy way to recover is to recreate the table.
> If this sounds really
> painful, though, maybe we can work up some kind of tool you could use to
> just recreate the missing tablets (with those rows lost).
>
> -Todd
>
>>
>> On Jul 11, 2016, at 10:37 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>> Hey Ben,
>>
>> Is the table that you're querying replicated? Or was it created with only
>> one replica per tablet?
>>
>> -Todd
>>
>> On Mon, Jul 11, 2016 at 10:35 AM, Benjamin Kim <b...@amobee.com> wrote:
>>
>>> Over the weekend, a tablet server went down. It’s not coming back up.
>>> So, I decommissioned it and removed it from the cluster. Then, I restarted
>>> Kudu because I was getting a timeout exception trying to do counts on the
>>> table. Now, when I try again. I get the same error.
>>>
>>> 16/07/11 17:32:36 WARN scheduler.TaskSetManager: Lost task 468.3 in
>>> stage 0.0 (TID 603, prod-dc1-datanode167.pdc1i.gradientx.com):
>>> com.stumbleupon.async.TimeoutException: Timed out after 30000ms when
>>> joining Deferred@712342716(state=PAUSED, result=Deferred@1765902299,
>>> callback=passthrough -> scanner opened -> wakeup thread Executor task
>>> launch worker-2, errback=openScanner errback -> passthrough -> wakeup
>>> thread Executor task launch worker-2)
>>> at com.stumbleupon.async.Deferred.doJoin(Deferred.java:1177)
>>> at com.stumbleupon.async.Deferred.join(Deferred.java:1045)
>>> at org.kududb.client.KuduScanner.nextRows(KuduScanner.java:57)
>>> at org.kududb.spark.kudu.RowResultIteratorScala.hasNext(KuduRDD.scala:99)
>>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>> at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
>>> at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
>>> at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>> at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Does anyone know how to recover from this?
>>>
>>> Thanks,
>>> *Benjamin Kim*
>>> *Data Solutions Architect*
>>>
>>> [a•mo•bee] *(n.)* the company defining digital marketing.
>>>
>>> *Mobile: +1 818 635 2900*
>>> 3250 Ocean Park Blvd, Suite 200 | Santa Monica, CA 90405 |
>>> www.amobee.com
>>>
>>> On Jul 6, 2016, at 9:46 AM, Dan Burkert <d...@cloudera.com> wrote:
>>>
>>> On Wed, Jul 6, 2016 at 7:05 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>
>>>> Over the weekend, the row count is up to <500M. I will give it another
>>>> few days to get to 1B rows. I still get consistent times ~15s for doing row
>>>> counts despite the amount of data growing.
>>>>
>>>> On another note, I got a solicitation email from SnappyData to evaluate
>>>> their product. They claim to be the “Spark Data Store” with tight
>>>> integration with Spark executors.
>>>> It claims to be an OLTP and OLAP system
>>>> with being an in-memory data store first then to disk. After going to
>>>> several Spark events, it would seem that this is the new “hot” area for
>>>> vendors. They all (MemSQL, Redis, Aerospike, Datastax, etc.) claim to be
>>>> the best “Spark Data Store”. I’m wondering if Kudu will become this too?
>>>> With the performance I’ve seen so far, it would seem that it can be a
>>>> contender. All that is needed is a hardened Spark connector package, I
>>>> would think. The next evaluation I will be conducting is to see if
>>>> SnappyData’s claims are valid by doing my own tests.
>>>>
>>>
>>> It's hard to compare Kudu against any other data store without a lot of
>>> analysis and thorough benchmarking, but it is certainly a goal of Kudu to
>>> be a great platform for ingesting and analyzing data through Spark. Up
>>> till this point most of the Spark work has been community driven, but more
>>> thorough integration testing of the Spark connector is going to be a focus
>>> going forward.
>>>
>>> - Dan
>>>
>>>> Cheers,
>>>> Ben
>>>>
>>>> On Jun 15, 2016, at 12:47 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>
>>>> Hi Benjamin,
>>>>
>>>> What workload are you using for benchmarks? Using spark or something
>>>> more custom? rdd or data frame or SQL, etc? Maybe you can share the schema
>>>> and some queries
>>>>
>>>> Todd
>>>>
>>>> Todd
>>>> On Jun 15, 2016 8:10 AM, "Benjamin Kim" <bbuil...@gmail.com> wrote:
>>>>
>>>>> Hi Todd,
>>>>>
>>>>> Now that Kudu 0.9.0 is out. I have done some tests. Already, I am
>>>>> impressed. Compared to HBase, read and write performance are better. Write
>>>>> performance has the greatest improvement (> 4x), while read is > 1.5x.
>>>>> Albeit, these are only preliminary tests. Do you know of a way to really do
>>>>> some conclusive tests? I want to see if I can match your results on my 50
>>>>> node cluster.
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>> On May 30, 2016, at 10:33 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>
>>>>> On Sat, May 28, 2016 at 7:12 AM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>
>>>>>> Todd,
>>>>>>
>>>>>> It sounds like Kudu can possibly top or match those numbers put out
>>>>>> by Aerospike. Do you have any performance statistics published or any
>>>>>> instructions as to measure them myself as good way to test? In addition,
>>>>>> this will be a test using Spark, so should I wait for Kudu version 0.9.0
>>>>>> where support will be built in?
>>>>>>
>>>>>
>>>>> We don't have a lot of benchmarks published yet, especially on the
>>>>> write side. I've found that thorough cross-system benchmarks are very
>>>>> difficult to do fairly and accurately, and often times users end up
>>>>> misguided if they pay too much attention to them :) So, given a finite
>>>>> number of developers working on Kudu, I think we've tended to spend more
>>>>> time on the project itself and less time focusing on "competition". I'm
>>>>> sure there are use cases where Kudu will beat out Aerospike, and probably
>>>>> use cases where Aerospike will beat Kudu as well.
>>>>>
>>>>> From my perspective, it would be great if you can share some details
>>>>> of your workload, especially if there are some areas you're finding Kudu
>>>>> lacking. Maybe we can spot some easy code changes we could make to improve
>>>>> performance, or suggest a tuning variable you could change.
>>>>>
>>>>> -Todd
>>>>>
>>>>>> On May 27, 2016, at 9:19 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>>>>>
>>>>>> On Fri, May 27, 2016 at 8:20 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> First of all, thanks for the link. It looks like an interesting
>>>>>>> read. I checked that Aerospike is currently at version 3.8.2.3, and in the
>>>>>>> article, they are evaluating version 3.5.4.
>>>>>>> The main thing that impressed me was their claim that they can beat
>>>>>>> Cassandra and HBase by 8x for writing and 25x for reading. Their big
>>>>>>> claim to fame is that Aerospike can write 1M records per second with
>>>>>>> only 50 nodes. I wanted to see if this is real.
>>>>>>>
>>>>>>
>>>>>> 1M records per second on 50 nodes is pretty doable by Kudu as well,
>>>>>> depending on the size of your records and the insertion order. I've been
>>>>>> playing with a ~70 node cluster recently and seen 1M+ writes/second
>>>>>> sustained, and bursting above 4M. These are 1KB rows with 11 columns, and
>>>>>> with pretty old HDD-only nodes. I think newer flash-based nodes could do
>>>>>> better.
>>>>>>
>>>>>>>
>>>>>>> To answer your questions, we have a DMP with user profiles with many
>>>>>>> attributes. We create segmentation information off of these attributes to
>>>>>>> classify them. Then, we can target advertising appropriately for our sales
>>>>>>> department. Much of the data processing is for applying models on all or if
>>>>>>> not most of every profile’s attributes to find similarities (nearest
>>>>>>> neighbor/clustering) over a large number of rows when batch processing or a
>>>>>>> small subset of rows for quick online scoring. So, our use case is a
>>>>>>> typical advanced analytics scenario. We have tried HBase, but it doesn’t
>>>>>>> work well for these types of analytics.
>>>>>>>
>>>>>>> I read, that Aerospike in the release notes, they did do many
>>>>>>> improvements for batch and scan operations.
>>>>>>>
>>>>>>> I wonder what your thoughts are for using Kudu for this.
>>>>>>>
>>>>>>
>>>>>> Sounds like a good Kudu use case to me. I've heard great things about
>>>>>> Aerospike for the low latency random access portion, but I've also heard
>>>>>> that it's _very_ expensive, and not particularly suited to the columnar
>>>>>> scan workload.
>>>>>> Lastly, I think the Apache license of Kudu is much more
>>>>>> appealing than the AGPL3 used by Aerospike. But, that's not really a direct
>>>>>> answer to the performance question :)
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ben
>>>>>>>
>>>>>>> On May 27, 2016, at 6:21 PM, Mike Percy <mpe...@cloudera.com> wrote:
>>>>>>>
>>>>>>> Have you considered whether you have a scan heavy or a random access
>>>>>>> heavy workload? Have you considered whether you always access / update a
>>>>>>> whole row vs only a partial row? Kudu is a column store so has some
>>>>>>> awesome performance characteristics when you are doing a lot of scanning of
>>>>>>> just a couple of columns.
>>>>>>>
>>>>>>> I don't know the answer to your question but if your concern is
>>>>>>> performance then I would be interested in seeing comparisons from a perf
>>>>>>> perspective on certain workloads.
>>>>>>>
>>>>>>> Finally, a year ago Aerospike did quite poorly in a Jepsen test:
>>>>>>> https://aphyr.com/posts/324-jepsen-aerospike
>>>>>>>
>>>>>>> I wonder if they have addressed any of those issues.
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>> On Friday, May 27, 2016, Benjamin Kim <bbuil...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am just curious. How will Kudu compare with Aerospike
>>>>>>>> (http://www.aerospike.com)? I went to a Spark Roadshow and found
>>>>>>>> out about this piece of software. It appears to fit our use case perfectly
>>>>>>>> since we are an ad-tech company trying to leverage our user profiles data.
>>>>>>>> Plus, it already has a Spark connector and has a SQL-like client. The
>>>>>>>> tables can be accessed using Spark SQL DataFrames and, also, made into SQL
>>>>>>>> tables for direct use with Spark SQL ODBC/JDBC Thriftserver. I see from the
>>>>>>>> work done here http://gerrit.cloudera.org:8080/#/c/2992/ that the
>>>>>>>> Spark integration is well underway and, from the looks of it lately, almost
>>>>>>>> complete.
>>>>>>>> I would prefer to use Kudu since we are already a Cloudera shop,
>>>>>>>> and Kudu is easy to deploy and configure using Cloudera Manager. I also
>>>>>>>> hope that some of Aerospike’s speed optimization techniques can make it
>>>>>>>> into Kudu in the future, if they have not been already thought of or
>>>>>>>> included.
>>>>>>>>
>>>>>>>> Just some thoughts…
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Ben
>>>>>>>
>>>>>>> --
>>>>>>> Mike Percy
>>>>>>> Software Engineer, Cloudera
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>>>
>>>>
>>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

--
Todd Lipcon
Software Engineer, Cloudera
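
For anyone finding this thread later: the recovery step suggested at the top (move the unparseable tablet metadata file aside, then start the tablet server so it can re-replicate) could be scripted roughly as below. This is only a sketch and not an official Kudu tool. The function name, the backup directory, and the `.bak` naming scheme are made up for illustration. It assumes the tablet server is stopped while the file is moved and that healthy replicas exist elsewhere (replication count 3, as discussed above).

```python
import shutil
import time
from pathlib import Path


def move_tablet_metadata_aside(meta_dir, tablet_id, backup_dir):
    """Move one tablet's metadata file out of the tablet-meta directory.

    Hypothetical helper: the file is moved rather than deleted, so it can
    still be inspected or restored later. Returns the new path of the file.
    """
    src = Path(meta_dir) / tablet_id
    if not src.is_file():
        raise FileNotFoundError(f"no tablet metadata file at {src}")

    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # The timestamp suffix (an arbitrary naming choice) keeps repeated
    # backups of the same tablet id from overwriting each other.
    dest = dest_dir / f"{tablet_id}.{int(time.time())}.bak"
    shutil.move(str(src), str(dest))
    return dest
```

With the values from the error message above, the call would look like `move_tablet_metadata_aside("/mnt/data1/kudu/data/tablet-meta", "24637ee6f3e5440181ce3f20b1b298ba", "/mnt/data1/kudu/tablet-meta-backup")`, where the backup path is an assumption. Restarting the tablet server afterwards is a separate manual step.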