Re: Performance Question

Benjamin Kim Mon, 18 Jul 2016 10:28:20 -0700

During my re-population of the Kudu table, I am getting this error trying to 
restart a tablet server after it went down. The job that populates this table 
has been running for over a week.


[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of 
type "kudu.tablet.TabletSuperBlockPB" because it is missing required fields: 
rowsets[2324].columns[15].block
F0718 17:01:26.783571   468 tablet_server_main.cc:55] Check failed: _s.ok() Bad 
status: IO error: Could not init Tablet Manager: Failed to open tablet metadata 
for tablet: 24637ee6f3e5440181ce3f20b1b298ba: Failed to load tablet metadata 
for tablet id 24637ee6f3e5440181ce3f20b1b298ba: Could not load tablet metadata 
from /mnt/data1/kudu/data/tablet-meta/24637ee6f3e5440181ce3f20b1b298ba: Unable 
to parse PB from path: 
/mnt/data1/kudu/data/tablet-meta/24637ee6f3e5440181ce3f20b1b298ba
*** Check failure stack trace: ***
    @           0x7d794d  google::LogMessage::Fail()
    @           0x7d984d  google::LogMessage::SendToLog()
    @           0x7d7489  google::LogMessage::Flush()
    @           0x7da2ef  google::LogMessageFatal::~LogMessageFatal()
    @           0x78172b  (unknown)
    @       0x344d41ed5d  (unknown)
    @           0x7811d1  (unknown)

Does anyone know what this means?

Thanks,
Ben


> On Jul 11, 2016, at 10:47 AM, Todd Lipcon <t...@cloudera.com> wrote:
> 
> On Mon, Jul 11, 2016 at 10:40 AM, Benjamin Kim <bbuil...@gmail.com 
> <mailto:bbuil...@gmail.com>> wrote:
> Todd,
> 
> I had it at one replica. Do I have to recreate?
> 
> We don't currently have the ability to "accept data loss" on a tablet (or set 
> of tablets). If the machine is gone for good, then currently the only easy 
> way to recover is to recreate the table. If this sounds really painful, 
> though, maybe we can work up some kind of tool you could use to just recreate 
> the missing tablets (with those rows lost).
> 
> -Todd
> 
>> On Jul 11, 2016, at 10:37 AM, Todd Lipcon <t...@cloudera.com 
>> <mailto:t...@cloudera.com>> wrote:
>> 
>> Hey Ben,
>> 
>> Is the table that you're querying replicated? Or was it created with only 
>> one replica per tablet?
>> 
>> -Todd
>> 
>> On Mon, Jul 11, 2016 at 10:35 AM, Benjamin Kim <b...@amobee.com 
>> <mailto:b...@amobee.com>> wrote:
>> Over the weekend, a tablet server went down. It’s not coming back up. So, I 
>> decommissioned it and removed it from the cluster. Then, I restarted Kudu 
>> because I was getting a timeout exception trying to do counts on the table. 
>> Now, when I try again, I get the same error.
>> 
>> 16/07/11 17:32:36 WARN scheduler.TaskSetManager: Lost task 468.3 in stage 
>> 0.0 (TID 603, prod-dc1-datanode167.pdc1i.gradientx.com 
>> <http://prod-dc1-datanode167.pdc1i.gradientx.com/>): 
>> com.stumbleupon.async.TimeoutException: Timed out after 30000ms when joining 
>> Deferred@712342716(state=PAUSED, result=Deferred@1765902299, 
>> callback=passthrough -> scanner opened -> wakeup thread Executor task launch 
>> worker-2, errback=openScanner errback -> passthrough -> wakeup thread 
>> Executor task launch worker-2)
>> at com.stumbleupon.async.Deferred.doJoin(Deferred.java:1177)
>> at com.stumbleupon.async.Deferred.join(Deferred.java:1045)
>> at org.kududb.client.KuduScanner.nextRows(KuduScanner.java:57)
>> at org.kududb.spark.kudu.RowResultIteratorScala.hasNext(KuduRDD.scala:99)
>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>> at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
>> at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
>> at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>> at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> 
>> Does anyone know how to recover from this?
>> 
>> Thanks,
>> Benjamin Kim
>> Data Solutions Architect
>> 
>> [a•mo•bee] (n.) the company defining digital marketing.
>> 
>> Mobile: +1 818 635 2900 <tel:%2B1%20818%20635%202900>
>> 3250 Ocean Park Blvd, Suite 200  |  Santa Monica, CA 90405  |  
>> www.amobee.com <http://www.amobee.com/>
>>> On Jul 6, 2016, at 9:46 AM, Dan Burkert <d...@cloudera.com 
>>> <mailto:d...@cloudera.com>> wrote:
>>> 
>>> 
>>> 
>>> On Wed, Jul 6, 2016 at 7:05 AM, Benjamin Kim <bbuil...@gmail.com 
>>> <mailto:bbuil...@gmail.com>> wrote:
>>> Over the weekend, the row count is up to <500M. I will give it another few 
>>> days to get to 1B rows. I still get consistent times ~15s for doing row 
>>> counts despite the amount of data growing.
>>> 
>>> On another note, I got a solicitation email from SnappyData to evaluate 
>>> their product. They claim to be the “Spark Data Store” with tight 
>>> integration with Spark executors. It claims to be both an OLTP and an OLAP 
>>> system that stores data in memory first and then on disk. After going to 
>>> several Spark events, it would seem that this is the new “hot” area for 
>>> vendors. They all (MemSQL, Redis, Aerospike, Datastax, etc.) claim to be 
>>> the best “Spark Data Store”. I’m wondering if Kudu will become this too? 
>>> With the performance I’ve seen so far, it would seem that it can be a 
>>> contender. All that is needed is a hardened Spark connector package, I 
>>> would think. The next evaluation I will be conducting is to see if 
>>> SnappyData’s claims are valid by doing my own tests.
>>> 
>>> It's hard to compare Kudu against any other data store without a lot of 
>>> analysis and thorough benchmarking, but it is certainly a goal of Kudu to 
>>> be a great platform for ingesting and analyzing data through Spark.  Up 
>>> till this point most of the Spark work has been community driven, but more 
>>> thorough integration testing of the Spark connector is going to be a focus 
>>> going forward.
>>> 
>>> - Dan
>>> 
>>>  
>>> Cheers,
>>> Ben
>>> 
>>> 
>>> 
>>>> On Jun 15, 2016, at 12:47 AM, Todd Lipcon <t...@cloudera.com 
>>>> <mailto:t...@cloudera.com>> wrote:
>>>> 
>>>> Hi Benjamin,
>>>> 
>>>> What workload are you using for benchmarks? Using spark or something more 
>>>> custom? rdd or data frame or SQL, etc? Maybe you can share the schema and 
>>>> some queries
>>>> 
>>>> Todd
>>>> 
>>>> On Jun 15, 2016 8:10 AM, "Benjamin Kim" <bbuil...@gmail.com 
>>>> <mailto:bbuil...@gmail.com>> wrote:
>>>> Hi Todd,
>>>> 
>>>> Now that Kudu 0.9.0 is out, I have done some tests. Already, I am 
>>>> impressed. Compared to HBase, read and write performance are better. Write 
>>>> performance has the greatest improvement (> 4x), while read is > 1.5x. 
>>>> Albeit, these are only preliminary tests. Do you know of a way to really 
>>>> do some conclusive tests? I want to see if I can match your results on my 
>>>> 50 node cluster.
>>>> 
>>>> Thanks,
>>>> Ben
>>>> 
>>>>> On May 30, 2016, at 10:33 AM, Todd Lipcon <t...@cloudera.com 
>>>>> <mailto:t...@cloudera.com>> wrote:
>>>>> 
>>>>> On Sat, May 28, 2016 at 7:12 AM, Benjamin Kim <bbuil...@gmail.com 
>>>>> <mailto:bbuil...@gmail.com>> wrote:
>>>>> Todd,
>>>>> 
>>>>> It sounds like Kudu can possibly top or match those numbers put out by 
>>>>> Aerospike. Do you have any performance statistics published or any 
>>>>> instructions on how to measure them myself as a good way to test? In addition, 
>>>>> this will be a test using Spark, so should I wait for Kudu version 0.9.0 
>>>>> where support will be built in?
>>>>> 
>>>>> We don't have a lot of benchmarks published yet, especially on the write 
>>>>> side. I've found that thorough cross-system benchmarks are very difficult 
>>>>> to do fairly and accurately, and often times users end up misguided if 
>>>>> they pay too much attention to them :) So, given a finite number of 
>>>>> developers working on Kudu, I think we've tended to spend more time on 
>>>>> the project itself and less time focusing on "competition". I'm sure 
>>>>> there are use cases where Kudu will beat out Aerospike, and probably use 
>>>>> cases where Aerospike will beat Kudu as well.
>>>>> 
>>>>> From my perspective, it would be great if you can share some details of 
>>>>> your workload, especially if there are some areas you're finding Kudu 
>>>>> lacking. Maybe we can spot some easy code changes we could make to 
>>>>> improve performance, or suggest a tuning variable you could change.
>>>>> 
>>>>> -Todd
>>>>> 
>>>>> 
>>>>>> On May 27, 2016, at 9:19 PM, Todd Lipcon <t...@cloudera.com 
>>>>>> <mailto:t...@cloudera.com>> wrote:
>>>>>> 
>>>>>> On Fri, May 27, 2016 at 8:20 PM, Benjamin Kim <bbuil...@gmail.com 
>>>>>> <mailto:bbuil...@gmail.com>> wrote:
>>>>>> Hi Mike,
>>>>>> 
>>>>>> First of all, thanks for the link. It looks like an interesting read. I 
>>>>>> checked that Aerospike is currently at version 3.8.2.3, and in the 
>>>>>> article, they are evaluating version 3.5.4. The main thing that 
>>>>>> impressed me was their claim that they can beat Cassandra and HBase by 
>>>>>> 8x for writing and 25x for reading. Their big claim to fame is that 
>>>>>> Aerospike can write 1M records per second with only 50 nodes. I wanted 
>>>>>> to see if this is real.
>>>>>> 
>>>>>> 1M records per second on 50 nodes is pretty doable by Kudu as well, 
>>>>>> depending on the size of your records and the insertion order. I've been 
>>>>>> playing with a ~70 node cluster recently and seen 1M+ writes/second 
>>>>>> sustained, and bursting above 4M. These are 1KB rows with 11 columns, 
>>>>>> and with pretty old HDD-only nodes. I think newer flash-based nodes 
>>>>>> could do better.
>>>>>>  
>>>>>> 
>>>>>> To answer your questions, we have a DMP with user profiles with many 
>>>>>> attributes. We create segmentation information off of these attributes 
>>>>>> to classify them. Then, we can target advertising appropriately for our 
>>>>>> sales department. Much of the data processing is for applying models on 
>>>>>> all, or at least most, of every profile’s attributes to find similarities 
>>>>>> (nearest neighbor/clustering) over a large number of rows when batch 
>>>>>> processing or a small subset of rows for quick online scoring. So, our 
>>>>>> use case is a typical advanced analytics scenario. We have tried HBase, 
>>>>>> but it doesn’t work well for these types of analytics.
>>>>>> 
>>>>>> I read in the Aerospike release notes that they made many improvements 
>>>>>> for batch and scan operations.
>>>>>> 
>>>>>> I wonder what your thoughts are for using Kudu for this.
>>>>>> 
>>>>>> Sounds like a good Kudu use case to me. I've heard great things about 
>>>>>> Aerospike for the low latency random access portion, but I've also heard 
>>>>>> that it's _very_ expensive, and not particularly suited to the columnar 
>>>>>> scan workload. Lastly, I think the Apache license of Kudu is much more 
>>>>>> appealing than the AGPL3 used by Aerospike. But, that's not really a 
>>>>>> direct answer to the performance question :)
>>>>>>  
>>>>>> 
>>>>>> Thanks,
>>>>>> Ben
>>>>>> 
>>>>>> 
>>>>>>> On May 27, 2016, at 6:21 PM, Mike Percy <mpe...@cloudera.com 
>>>>>>> <mailto:mpe...@cloudera.com>> wrote:
>>>>>>> 
>>>>>>> Have you considered whether you have a scan heavy or a random access 
>>>>>>> heavy workload? Have you considered whether you always access / update 
>>>>>>> a whole row vs only a partial row? Kudu is a column store so has some 
>>>>>>> awesome performance characteristics when you are doing a lot of 
>>>>>>> scanning of just a couple of columns.
>>>>>>> 
>>>>>>> I don't know the answer to your question but if your concern is 
>>>>>>> performance then I would be interested in seeing comparisons from a 
>>>>>>> perf perspective on certain workloads.
>>>>>>> 
>>>>>>> Finally, a year ago Aerospike did quite poorly in a Jepsen test: 
>>>>>>> https://aphyr.com/posts/324-jepsen-aerospike 
>>>>>>> <https://aphyr.com/posts/324-jepsen-aerospike>
>>>>>>> 
>>>>>>> I wonder if they have addressed any of those issues.
>>>>>>> 
>>>>>>> Mike
>>>>>>> 
>>>>>>> On Friday, May 27, 2016, Benjamin Kim <bbuil...@gmail.com 
>>>>>>> <mailto:bbuil...@gmail.com>> wrote:
>>>>>>> I am just curious. How will Kudu compare with Aerospike 
>>>>>>> (http://www.aerospike.com <http://www.aerospike.com/>)? I went to a 
>>>>>>> Spark Roadshow and found out about this piece of software. It appears 
>>>>>>> to fit our use case perfectly since we are an ad-tech company trying to 
>>>>>>> leverage our user profiles data. Plus, it already has a Spark connector 
>>>>>>> and has a SQL-like client. The tables can be accessed using Spark SQL 
>>>>>>> DataFrames and, also, made into SQL tables for direct use with Spark 
>>>>>>> SQL ODBC/JDBC Thriftserver. I see from the work done here 
>>>>>>> http://gerrit.cloudera.org:8080/#/c/2992/ 
>>>>>>> <http://gerrit.cloudera.org:8080/#/c/2992/> that the Spark integration 
>>>>>>> is well underway and, from the looks of it lately, almost complete. I 
>>>>>>> would prefer to use Kudu since we are already a Cloudera shop, and Kudu 
>>>>>>> is easy to deploy and configure using Cloudera Manager. I also hope 
>>>>>>> that some of Aerospike’s speed optimization techniques can make it into 
>>>>>>> Kudu in the future, if they have not been already thought of or 
>>>>>>> included.
>>>>>>> 
>>>>>>> Just some thoughts…
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Ben
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> --
>>>>>>> Mike Percy
>>>>>>> Software Engineer, Cloudera
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> -- 
>> Todd Lipcon
>> Software Engineer, Cloudera
> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
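
The write rates quoted earlier in the thread (about 70 nodes, 1M+ writes/second sustained, bursts above 4M, 1KB rows) can be put in per-node terms with some rough arithmetic. The sketch below is only a back-of-the-envelope check: it assumes 1KB means 1024 bytes and that the load is spread evenly across nodes, and it ignores replication and on-disk encoding overhead, none of which the thread specifies. The function name is made up for illustration.

```python
# Rough per-node ingest implied by the figures quoted in the thread.
# From the thread: ~70 nodes, 1M writes/s sustained, 4M writes/s burst, 1KB rows.
# Assumptions (NOT from the thread): 1KB = 1024 bytes, even load across nodes,
# replication and storage overhead ignored.

ROW_BYTES = 1024
NODES = 70


def per_node_mb_per_s(rows_per_s, nodes=NODES, row_bytes=ROW_BYTES):
    """Average ingest per node in MB/s (1 MB = 10**6 bytes)."""
    return rows_per_s * row_bytes / nodes / 1e6


sustained = per_node_mb_per_s(1_000_000)
burst = per_node_mb_per_s(4_000_000)

print(f"sustained: {sustained:.1f} MB/s per node")  # sustained: 14.6 MB/s per node
print(f"burst:     {burst:.1f} MB/s per node")      # burst:     58.5 MB/s per node
```

Under these assumptions the sustained rate works out to roughly 15 MB/s of row data per node, which is consistent with the remark that the test ran on older HDD-only machines. Passing `nodes=50` gives the same estimate for a 50 node cluster.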
