Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Sunday, 11 June 2023 at 00:05:52 UTC, H. S. Teoh wrote:
On Sat, Jun 10, 2023 at 09:58:12PM +, Cecil Ward via 
Digitalmars-d-learn wrote:

On Friday, 9 June 2023 at 15:07:54 UTC,



[...]

On contemporary machines, the CPU is so fast that memory access 
is a much bigger bottleneck than processing speed. So unless an 
operation is being run hundreds of thousands of times, you're 
not likely to notice the difference. OTOH, accessing memory is 
slow (that's why the memory cache hierarchy exists). So UTF-8 is 
actually advantageous here: it fits in a smaller space, so it's 
faster to fetch from memory, and more of it can fit in the CPU 
cache, so fewer DRAM round trips are needed. Yes, you need extra 
processing because of the variable-width encoding, but it happens 
mostly inside the CPU, which is fast enough that the decoding 
cost is generally smaller than the memory round-trip time it 
saves. So unless you're doing something *really* complex with 
the UTF-8 data, it's an overall win in terms of performance. The 
CPU gets to do what it's good at -- running complex code -- and 
the memory cache gets to do what it's good at: minimizing the 
number of slow DRAM round trips.




I completely agree with H. S. Teoh. That is exactly what I was 
going to say. The point is that considerations like this have to 
be thought through carefully, and the width of types really does 
matter in the cases brought up.


But outside these cases, as I said earlier, stick to uint, size_t 
and ulong, or uint32_t and uint64_t if exact size is vital. Do 
check out the other std.stdint types as well, as very 
occasionally they are needed.
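
For anyone who has not met these aliases, here is a minimal sketch of what std.stdint provides. It assumes a current Phobos, where std.stdint forwards to core.stdc.stdint. The widths of the "least" and "fast" variants differ between platforms, so only lower bounds are checked for them.

```d
import std.stdint;
import std.stdio : writeln;

// Exact-width aliases: the size is guaranteed.
static assert(int8_t.sizeof == 1);
static assert(uint16_t.sizeof == 2);
static assert(uint32_t.sizeof == 4);
static assert(uint64_t.sizeof == 8);

// "least" and "fast" aliases: at least this wide, possibly wider.
static assert(uint_least16_t.sizeof >= 2);
static assert(uint_fast16_t.sizeof >= 2);

// Pointer-sized and maximum-width integers.
static assert(uintptr_t.sizeof == (void*).sizeof);
static assert(uintmax_t.sizeof >= 8);

void main()
{
    // These two values depend on the target platform.
    writeln("uint_fast16_t is ", uint_fast16_t.sizeof, " bytes here");
    writeln("size_t is ", size_t.sizeof, " bytes here");
}
```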




Re: byte and short data types use cases

2023-06-10 Thread H. S. Teoh via Digitalmars-d-learn
On Sat, Jun 10, 2023 at 09:58:12PM +, Cecil Ward via Digitalmars-d-learn 
wrote:
> On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:
[...]
> > So you can optimize memory usage by using arrays of things smaller
> > than `int` if these are enough for your purposes, but what about
> > using these instead of single variables, for example as an iterator
> > in a loop, if range of such a data type is enough for me? Is there
> > any advantages on doing that?
> 
> A couple of other important use-cases came to me. The first one is
> unicode which has three main representations, utf-8 which is a stream
> of bytes each character can be several bytes, utf-16 where a character
> can be one or rarely two 16-bit words, and utf32 - a stream of 32-bit
> words, one per character. The simplicity of the latter is a huge deal
> in speed efficiency, but utf32 takes up almost four times as memory as
> utf-8 for western european languages like english or french. The
> four-to-one ratio means that the processor has to pull in four times
> the amount of memory so that’s a slowdown, but on the other hand it is
> processing the same amount of characters whichever way you look at it,
> and in utf8 the cpu is having to parse more bytes than characters
> unless the text is entirely ASCII-like.
[...]

On contemporary machines, the CPU is so fast that memory access is a
much bigger bottleneck than processing speed. So unless an operation is
being run hundreds of thousands of times, you're not likely to notice
the difference. OTOH, accessing memory is slow (that's why the memory
cache hierarchy exists). So UTF-8 is actually advantageous here: it fits
in a smaller space, so it's faster to fetch from memory, and more of it
can fit in the CPU cache, so fewer DRAM round trips are needed. Yes, you
need extra processing because of the variable-width encoding, but it
happens mostly inside the CPU, which is fast enough that the decoding
cost is generally smaller than the memory round-trip time it saves. So
unless you're doing something *really* complex with the UTF-8 data, it's
an overall win in terms of performance. The CPU gets to do what it's
good at -- running complex code -- and the memory cache gets to do what
it's good at: minimizing the number of slow DRAM round trips.
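
To give a feel for the "extra processing" that variable-width decoding involves, here is a minimal sketch using std.utf. The helper names countCodePoints and decodeAll are made up for illustration; stride and decode are the Phobos functions that do the actual work.

```d
import std.stdio : writeln;
import std.utf : decode, stride;

/// Counts code points by hopping over each variable-width sequence.
size_t countCodePoints(string s)
{
    size_t count = 0;
    // stride(s, i) is the number of bytes in the sequence starting at i.
    for (size_t i = 0; i < s.length; i += stride(s, i))
        ++count;
    return count;
}

/// Decodes every sequence into a full 32-bit code point.
dchar[] decodeAll(string s)
{
    dchar[] result;
    size_t i = 0;
    while (i < s.length)
        result ~= decode(s, i); // decode advances i past the sequence
    return result;
}

void main()
{
    string s = "na\u00EFve"; // the i-diaeresis takes two bytes in UTF-8
    writeln(s.length, " bytes, ", countCodePoints(s), " code points");
    writeln(decodeAll(s));
}
```

Both loops only touch bytes that are already in cache once the string has been fetched, which is the point being made above.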


T

-- 
It said to install Windows 2000 or better, so I installed Linux instead.


Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Saturday, 10 June 2023 at 21:58:12 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

[...]


Is this some kind of property? Where can I read more about 
this?


My last example is comms. Protocol headers need economical, narrow 
data types for efficiency. It’s all about packing as much user 
data as possible into each packet, and fatter, longer headers 
reduce the amount of user data because the total has a hard limit 
on it. A pair of headers totalling 40 bytes in IPv4+TCP takes up 
nearly 3% of the total length allowed (40 of the 1500 bytes in a 
typical Ethernet payload), so that’s a ~3% loss in throughput, as 
the headers are just dead weight. So here narrow types help comms 
speed.
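
As a rough sketch of the arithmetic, the two minimal headers can be written as D structs built from ubyte, ushort and uint fields. The field names are illustrative, options are left out, byte order is ignored, and the 1500-byte packet size is an assumption (the usual Ethernet MTU), not something stated above.

```d
import std.stdio : writefln;

// Minimal IPv4 header without options; field names are illustrative.
struct Ipv4Header
{
    ubyte  versionAndLength; // two 4-bit fields packed into one byte
    ubyte  typeOfService;
    ushort totalLength;
    ushort identification;
    ushort flagsAndOffset;   // 3-bit flags plus 13-bit fragment offset
    ubyte  timeToLive;
    ubyte  protocol;
    ushort checksum;
    uint   source;
    uint   destination;
}

// Minimal TCP header without options.
struct TcpHeader
{
    ushort sourcePort;
    ushort destinationPort;
    uint   sequenceNumber;
    uint   acknowledgementNumber;
    ubyte  dataOffsetAndReserved;
    ubyte  flags;
    ushort window;
    ushort checksum;
    ushort urgentPointer;
}

static assert(Ipv4Header.sizeof == 20);
static assert(TcpHeader.sizeof == 20);

/// Share of a packet taken up by headers, in percent.
double overheadPercent(size_t headerBytes, size_t packetBytes)
{
    return 100.0 * headerBytes / packetBytes;
}

void main()
{
    enum packetBytes = 1500; // assumption: typical Ethernet MTU
    enum headerBytes = Ipv4Header.sizeof + TcpHeader.sizeof;
    writefln("%s header bytes = %.2f%% of %s",
             headerBytes, overheadPercent(headerBytes, packetBytes), packetBytes);
}
```

Had every field been a 32-bit uint, the same two headers would have come to 76 bytes instead of 40.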


Re: byte and short data types use cases

2023-06-10 Thread Cecil Ward via Digitalmars-d-learn

On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote:

On Friday, 9 June 2023 at 12:56:20 UTC, Cecil Ward wrote:

On Friday, 9 June 2023 at 11:24:38 UTC, Murloc wrote:

If you have four ubyte variables in a struct and then
an array of them, then you are getting optimal memory usage.


Is this some kind of property? Where can I read more about this?
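
One way to see the effect is through D's built-in .sizeof property. The sketch below uses made-up struct names and assumes the usual C-compatible layout rules that D follows on common platforms.

```d
import std.stdio : writeln;

struct Narrow    { ubyte r, g, b, a; }                    // four 1-byte fields
struct Wide      { int r, g, b, a; }                      // same data as ints
struct Padded    { ubyte tag; uint value; ubyte flag; }   // padding around value
struct Reordered { uint value; ubyte tag; ubyte flag; }   // widest field first

static assert(Narrow.sizeof == 4);
static assert(Wide.sizeof == 16);
static assert(Padded.sizeof == 12);   // 1 + 3 padding + 4 + 1 + 3 padding
static assert(Reordered.sizeof == 8); // 4 + 1 + 1 + 2 padding

// An array simply multiplies the per-element size.
static assert((Narrow[1000]).sizeof == 4000);
static assert((Wide[1000]).sizeof == 16_000);

void main()
{
    writeln("Narrow: ", Narrow.sizeof, ", Wide: ", Wide.sizeof);
    writeln("Padded: ", Padded.sizeof, ", Reordered: ", Reordered.sizeof);
    writeln("value sits at offset ", Padded.value.offsetof, " in Padded");
}
```

The related properties .alignof and .offsetof show where the padding comes from.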

So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes, but 
what about using these instead of single variables, for example 
as an iterator in a loop, if the range of such a data type is 
enough for me? Are there any advantages to doing that?
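
On the single-variable question, one relevant fact is that D, like C, promotes ubyte and short operands to int before doing arithmetic, so a narrow loop counter is widened anyway. A minimal sketch, with hypothetical helper names:

```d
import std.stdio : writeln;

/// The sum of two ubytes is computed as an int, so nothing is lost.
int addPromoted(ubyte a, ubyte b)
{
    auto sum = a + b;
    static assert(is(typeof(sum) == int)); // integer promotion
    return sum;
}

/// Storing the result back into a ubyte needs an explicit, lossy cast.
ubyte addWrapped(ubyte a, ubyte b)
{
    return cast(ubyte)(a + b);
}

void main()
{
    writeln(addPromoted(200, 100)); // full int result
    writeln(addWrapped(200, 100));  // wrapped modulo 256

    // A ubyte counter works for small ranges, but it can never reach 256:
    // it wraps from 255 back to 0, so `i < 256` would loop forever.
    uint total = 0;
    for (ubyte i = 0; i < 10; ++i)
        total += i;
    writeln(total);
}
```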


A couple of other important use-cases came to me. The first one 
is Unicode, which has three main representations: UTF-8, a stream 
of bytes in which each character can take several bytes; UTF-16, 
where a character is one or, rarely, two 16-bit words; and 
UTF-32, a stream of 32-bit words, one per character. The 
simplicity of the latter is a huge deal for speed, but UTF-32 
takes up almost four times as much memory as UTF-8 for western 
European languages like English or French. The four-to-one ratio 
means that the processor has to pull in four times the amount of 
memory, so that’s a slowdown, but on the other hand it is 
processing the same number of characters whichever way you look 
at it, and in UTF-8 the CPU has to parse more bytes than 
characters unless the text is entirely ASCII-like.
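
D's string, wstring and dstring types map onto these three encodings, so the size ratio is easy to check. This is only a sketch: byteSize is a made-up helper and the French sample text is arbitrary.

```d
import std.conv : to;
import std.range : walkLength;
import std.stdio : writeln;

/// Storage taken by the code units of a string, in bytes.
size_t byteSize(C)(const(C)[] s)
{
    return s.length * C.sizeof;
}

void main()
{
    // "Voila, c'est l'ete" with accents, written with escapes so the
    // result does not depend on how this source file is encoded.
    string  utf8  = "Voil\u00E0, c'est l'\u00E9t\u00E9";
    wstring utf16 = utf8.to!wstring;
    dstring utf32 = utf8.to!dstring;

    writeln("code points: ", utf8.walkLength); // 18
    writeln("UTF-8:  ", byteSize(utf8), " bytes");  // 21
    writeln("UTF-16: ", byteSize(utf16), " bytes"); // 36
    writeln("UTF-32: ", byteSize(utf32), " bytes"); // 72
}
```

With only three accented letters out of eighteen characters, UTF-32 comes out at roughly 3.4 times the size of UTF-8 here.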


The second use-case is about SIMD. Intel and AMD x86 machines 
have vector arithmetic units that are either 16, 32 or 64 bytes 
wide depending on how recent the model is. Taking for example a 
post-2013 Intel Haswell CPU, which has 32-byte wide units, if you 
choose smaller width data types you can fit more in the vector 
unit. That’s how it works: fitting in integers or floating point 
numbers of half the width means that you can process twice as 
many in one instruction. On our Haswell that means four doubles 
or four 64-bit quad words, or eight 32-bit floats or uint32_ts, 
and doubling again to sixteen for uint16_t. So here width economy 
translates directly into speed: halving the width doubles the 
throughput.
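
The lane counts are just a division, which can be sketched at compile time. The lanes template and addInPlace are made-up names; whether the array operation is actually compiled to vector instructions depends on the compiler and flags, so treat that part as an assumption.

```d
import std.stdio : writeln;

/// How many elements of type T fit in a vector register of the given width.
enum lanes(T, size_t vectorBytes) = vectorBytes / T.sizeof;

// 32-byte registers, as on Haswell.
static assert(lanes!(double, 32) == 4);
static assert(lanes!(ulong, 32) == 4);
static assert(lanes!(float, 32) == 8);
static assert(lanes!(uint, 32) == 8);
static assert(lanes!(ushort, 32) == 16);
static assert(lanes!(ubyte, 32) == 32);

/// Element-wise addition written as a D array operation, which an
/// optimizing compiler may turn into SIMD instructions.
void addInPlace(float[] dst, const(float)[] src)
{
    dst[] += src[]; // both slices must have the same length
}

void main()
{
    auto a = new float[](8);
    auto b = new float[](8);
    a[] = 1.0f;
    b[] = 2.0f;
    addInPlace(a, b);
    writeln(a);
    writeln("floats per 32-byte register: ", lanes!(float, 32));
}
```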


Re: byte and short data types use cases

2023-06-10 Thread Salih Dincer via Digitalmars-d-learn

On Friday, 9 June 2023 at 23:51:07 UTC, Basile B. wrote:
Yes, a classic resource is 
http://www.catb.org/esr/structure-packing/


So you can optimize memory usage by using arrays of things 
smaller than `int` if these are enough for your purposes,


So, is the field ordering correct in a structure like the one 
below, where the union members only partially overlap?


```d
struct DATA
{
    union
    {
        ulong bits;
        ubyte[size] cell;
    }
    enum size = 5;
    bool last;
    alias last this;

    size_t length, limit, index = ulong.sizeof;

    bool empty()
    {
        return index / ulong.sizeof >= limit;
    }

    ubyte[] data;
    ubyte front()
    {
        //..
```
This code snippet is from an actual working project of mine. Its 
purpose is to process 40 bits of data.
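
For reference, a minimal sketch of how such a partially overlapping union is laid out. The names Overlay and lowFiveBytes are hypothetical and not from the project above; the byte order shown holds only on little-endian targets.

```d
import std.stdio : writeln;

struct Overlay
{
    union
    {
        ulong bits;     // 8 bytes
        ubyte[5] cell;  // shares the first 5 of those 8 bytes
    }
}

// The union is as large as its largest member; both start at offset 0.
static assert(Overlay.sizeof == 8);
static assert(Overlay.bits.offsetof == 0);
static assert(Overlay.cell.offsetof == 0);

/// Returns the five bytes that overlap the start of the ulong.
ubyte[5] lowFiveBytes(ulong value)
{
    Overlay o;
    o.bits = value;
    return o.cell;
}

void main()
{
    // On a little-endian machine these are the five least significant bytes.
    writeln(lowFiveBytes(0x01_02_03_04_05));
}
```

The three bytes of bits not covered by cell are still part of the struct, so the 40-bit payload occupies 8 bytes of storage.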


SDB@79



Re: Problem with dmd-2.104.0 -dip1000 & @safe

2023-06-10 Thread Dennis via Digitalmars-d-learn

On Friday, 9 June 2023 at 04:05:27 UTC, An Pham wrote:

Getting the error below for the following code. Looks like a bug?


Filed as https://issues.dlang.org/show_bug.cgi?id=23985

You can work around it by marking parameter `a` as `return scope`.