Sorry, I made some typos; here is a better version.

@Avi

"User-level scheduling is great for high performance I/O intensive
applications like databases and file systems." This is a claim often made
by people who want to use user-level threads, but I have rarely seen a
significant performance gain from it. Since you are claiming that you do
see one, it would be great if you could quantify it. The other day I saw a
benchmark of a Golang server, which supports user-level (green) threads
natively, and it was able to handle 10K concurrent requests. Even Nginx,
which is written in C and uses kernel threads, can handle that many with
non-blocking I/O. We all know concurrency is not parallelism.

M:N threading also has costs; one may end up paying any of the following.

*Duplication of the schedulers*
M:N requires two schedulers that do essentially the same work, one at user
level and one in the kernel. This is undesirable: it requires frequent
communication between kernel and user space to transfer scheduling
information.

Two schedulers also take more space in both the D-cache and the I-cache
than a single scheduler. It is highly undesirable for cache misses to be
caused by the schedulers rather than the application, because an L2 cache
miss can be more expensive than a kernel thread switch; the additional
scheduler can then become a troublemaker. In that case, saving kernel traps
does not justify a user-level scheduler, and this is all the more true as
processors keep making kernel traps faster.

*Thread local data maintenance*
M:N has to maintain thread-specific data that the kernel already provides
for kernel threads, such as TLS data and the error number. Providing the
same features for user threads is not straightforward because, for example,
the error number (errno) is set on system call failure and maintained per
kernel thread. User-level support degrades system performance and increases
system complexity.

*System info oblivious*
The kernel scheduler is close to the underlying platform and architecture
and can take advantage of their features. This is difficult for a
user-level thread library, because it is a layer at user level; user
threads are second-order entities in the system. For instance, if a kernel
thread uses a GDT slot for its TLS data, a user thread can perhaps only use
an LDT slot. With processors offering ever more support for threading and
scheduling (hyper-threading, NUMA, many-core), this second-order nature
seriously limits M:N threading.

> On Sun, Mar 12, 2017 at 1:15 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Hi Dor,
>>
>> I will reply to this on a separate thread since there seem to be good
>> knowledge exchange on this thread!
>>
>> Thanks!
>> kant
>>
>> On Sun, Mar 12, 2017 at 12:48 AM, Dor Laor <d...@scylladb.com> wrote:
>>
>>> Hi Kant,
>>>
>>> 2.0 is scheduled around July. If you need LWT, it will be part of it.
>>> If you need other features, they'll be in before then. The status page
>>> will get refreshed this week.
>>>
>>> About gains, it depends on the hardware. With large physical machines
>>> (or the large i3.16xl) we can show 10X and beyond. With smaller machines
>>> it may be 1.5-2X, but do check the c3.2xl (4-core machine) benchmark on
>>> our
>>> site; there are lots of factors where we do better.
>>>
>>> Actually the right benchmark is the other way around and will show
>>> Scylla's
>>> value even more clearly: you define your workload in terms of OPS and
>>> latency, and then you
>>> find the right set of hardware for Scylla and Cassandra and compare.
>>> I am *sure* that the difference is substantial.
>>>
>>> Can you please elaborate on your workload and the type of machines you
>>> use?
>>> Schema and row length plus access pattern are also welcome.
>>>
>>> Cheers,
>>> Dor
>>>
>>>
>>> On Sun, Mar 12, 2017 at 12:13 AM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Progress is all that matters, and I do think you guys have assembled a
>>>> smart team. Looking at that status page, I am willing to try ScyllaDB
>>>> whenever the 2.0 version is released. We have big banks as our clients, and
>>>> I would most likely switch if I see a 10X or similarly significant
>>>> improvement with our workloads, but if it only gives me 1.5-2X I wouldn't
>>>> be so inclined to go through all the work.
>>>>
>>>> On Sat, Mar 11, 2017 at 10:38 PM, benjamin roth <brs...@gmail.com>
>>>> wrote:
>>>>
>>>>> There is no reason to be angry. This is progress. This is the circle
>>>>> of life.
>>>>>
>>>>> It happens anywhere at any time.
>>>>>
>>>>> Am 12.03.2017 07:34 schrieb "Dor Laor" <d...@scylladb.com>:
>>>>>
>>>>>> On Sat, Mar 11, 2017 at 10:02 PM, Jeff Jirsa <jji...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2017-03-10 09:57 (-0800), Rakesh Kumar wrote:
>>>>>>> > Cassandra vs Scylla is a valid comparison because they both are
>>>>>>> compatible. Scylla is a drop-in replacement for Cassandra.
>>>>>>>
>>>>>>> No, they aren't, and no, it isn't
>>>>>>>
>>>>>>
>>>>>> Jeff is angry with us for some reason. I don't know why; it's natural
>>>>>> that when
>>>>>> a new opponent appears there are objections, and the burden of proof
>>>>>> lies on us. We go to great lengths to provide it, and we don't just
>>>>>> throw out comments without backing.
>>>>>>
>>>>>> Scylla IS a drop-in replacement for C*. We support the same CQL (from
>>>>>> version 1.7 it's CQL 3.3.1, protocol v4) and the same SSTable format
>>>>>> (based on
>>>>>> 2.1.8). In the 1.7 release we support the CQL uploader
>>>>>> from 3.x. We will support the SSTable format of 3.x natively in 3
>>>>>> months' time. Soon the whole feature set will be implemented. We have
>>>>>> always
>>>>>> been using this page (not 100% up to date; we'll update it this week):
>>>>>> http://www.scylladb.com/technology/status/
>>>>>>
>>>>>> We added a jmx-proxy daemon in Java in order to make the transition as
>>>>>> smooth as possible. Almost all the nodetool commands just work,
>>>>>> certainly all the important ones.
>>>>>> Btw, we have a REST API and Prometheus formats, much better than the
>>>>>> hairy JMX one.
>>>>>>
>>>>>> Spark, KairosDB, Presto and probably Titan work as well (we added
>>>>>> Thrift just for legacy users, and we don't intend
>>>>>> to decommission an API).
>>>>>>
>>>>>> Regarding benchmarks, if someone finds a flaw in them, we'll do our
>>>>>> best to fix it.
>>>>>> Or let's ignore them and just hear what our users have to say:
>>>>>> http://www.scylladb.com/users/
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>
