Re: mixin template's alias parameter ... ignored ?

someone via Digitalmars-d-learn Tue, 13 Jul 2021 07:11:03 -0700

On Tuesday, 13 July 2021 at 05:37:49 UTC, ag0aep6g wrote:

On 13.07.21 03:03, someone wrote:
On Monday, 12 July 2021 at 23:28:29 UTC, ag0aep6g wrote:
[...]
I'm not sure where we stand with `in`
You mean *we* = D developers ?
Yes. Let me rephrase and elaborate: I'm not sure what the current status of `in` is. It used to mean `const scope`. But DIP1000 changes the effects of `scope` and there was some discussion about its relation to `in`.
Checking the spec, it says that `in` simply means `const` unless you use `-preview=in`. The preview switch makes it `const scope` again, but that's not all. There's also something about passing by reference.
https://dlang.org/spec/function.html#in-params

ACK. So for the time being I'll be reverting all my input parameters to const (unless ref or out, of course), and when the whole in DIP matter resolves (one way or the other) I'll revert them (or not) accordingly. Parameters declared in read more naturally (and akin to out) than const, but that is form, not function, and function is what I need to get right right now.
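
A minimal sketch of what that swap amounts to, assuming default compiler flags (no `-preview=in`). The function names `countSpacesIn` and `countSpacesConst` are hypothetical and exist only for this illustration:

```d
// Sketch only: with default flags, an `in` parameter is typed like a `const` one.
size_t countSpacesIn(in char[] text) @safe pure nothrow @nogc
{
    // The parameter's type is const(char[]), exactly as in the const variant below.
    static assert(is(typeof(text) == const(char[])));

    size_t n;
    foreach (c; text)
        if (c == ' ') ++n;
    return n;
}

size_t countSpacesConst(const char[] text) @safe pure nothrow @nogc
{
    size_t n;
    foreach (c; text)
        if (c == ' ') ++n;
    return n;
}

unittest
{
    // Both variants accept the same arguments and give the same result.
    assert(countSpacesIn("a b c") == 2);
    assert(countSpacesConst("a b c") == 2);
}
```

Callers do not change when a parameter moves from in to const, so the change can be undone later without touching call sites.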

For a UDT like mine I think it has a lot of sense because when I think of a string and I want to chop/count/whatever on it my mind works one-based not zero-based. Say "abc" needs b my mind works a lot easier mid("abc", 2, 1) than mid("abc", 1, 1) and besides I am *not* returning a range or a reference slice to a range or whatever I am returning a whole new string construction. If I would be returning a range I will follow common sense since I don't know what will be done thereafter of course.
I think you're setting yourself up for off-by-one bugs by going against the grain like that. Your functions are one-based. The rest of the D world, including the standard library, is zero-based. You're bound to forget to account for the difference.


And I think you have a good point. I'll reconsider.

But it's your code, and you can do whatever you want, of course. Just looked like it might be a mistake.
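
To make the off-by-one concern concrete, here is a rough sketch of a one-based `mid` layered on a zero-based slice. The name `mid` comes from the example in the quote, but this body is only an assumption for illustration; it slices code units, so it is meaningful for ASCII input only:

```d
/// Hypothetical one-based helper: `start` counts from 1; returns null when out of range.
string mid(string s, size_t start, size_t count) @safe pure nothrow @nogc
{
    if (start < 1 || count < 1)
        return null;

    immutable first = start - 1;     // one-based start → zero-based index
    immutable last  = first + count; // one past the final element, as slices expect

    return last <= s.length ? s[first .. last] : null;
}

unittest
{
    assert(mid("abc", 2, 1) == "b"); // one-based call
    assert("abc"[1 .. 2] == "b");    // the equivalent zero-based slice
    assert(mid("abc", 3, 2) is null); // runs past the end
}
```

Every call goes through the `start - 1` conversion, which is exactly the step that is easy to forget when mixing such helpers with zero-based library code.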

All in all the whole module was updated accordingly and it seems it is working as expected (further testing needed) but, in the meantime, I learned a lot of things following the advice given by you, Ali, and others in this forum:


```d

/// implementation-bugs [-] using foreach (with this structure) always misses the last grapheme‐cluster … possible phobos bug #20483 @ unittest's last line

/// implementation‐tasks [+] reconsider making this whole UDT zero‐based as suggested by ag0aep6g, who has a good point
/// implementation‐tasks [+] reconsider excessive cast usage as suggested by Ali: bypassing compiler checks could be potentially harmful … cast and integer promotion @ http://ddili.org/ders/d.en/cast.html
/// implementation‐tasks [-] for the time being input parameters are declared const instead of in; eventually they'll be back to in when the related DIP is settled once and for all; but definitely not scope const


/// implementation‐tasks‐possible [-] pad[L|R]
/// implementation‐tasks‐possible [-] replicate/repeat
/// implementation‐tasks‐possible [-] replace(string, string)

/// implementation‐tasks‐possible [-] translate(string, string) … same‐size strings matching one‐to‐one

/// usage: array slicing can be used for usual things like: left() right() substr() etc … mainly when grapheme‐clusters are not expected at all
/// usage: array slicing needs a zero‐based first range argument and a one‐based second one (or one‐past‐the‐end), which is somehow … counter‐intuitive


module fw.types.UniCode;

import std.algorithm : map, joiner;
import std.array : array;
import std.conv : to;

import std.range : walkLength, take, tail, drop, dropBack; /// repeat, padLeft, padRight

import std.stdio;
import std.uni : Grapheme, byGrapheme;

/// within this file: gudtUGC

shared static this() { }  /// the following will be executed only‐once per‐app:
static this() { }         /// the following will be executed only‐once per‐thread:
static ~this() { }        /// the following will be executed only‐once per‐thread:
shared static ~this() { } /// the following will be executed only‐once per‐app:




alias stringUGC = Grapheme;
alias stringUGC08 = gudtUGC!(stringUTF08);
alias stringUGC16 = gudtUGC!(stringUTF16);
alias stringUGC32 = gudtUGC!(stringUTF32);
alias stringUTF08 = string;  /// same as immutable(char )[];
alias stringUTF16 = wstring; /// same as immutable(wchar)[];
alias stringUTF32 = dstring; /// same as immutable(dchar)[];

/// mixin templateUGC!(stringUTF08, r"gudtUGC08"d);
/// mixin templateUGC!(stringUTF16, r"gudtUGC16"d);
/// mixin templateUGC!(stringUTF32, r"gudtUGC32"d);

/// template templateUGC (typeStringUTF, alias lstrStructureID) {
/// if these were possible there would be no need for stringUGC## aliases in main()

public struct gudtUGC(typeStringUTF) { /// UniCode grapheme‐cluster‐aware string manipulation (implemented for one‐based operations)


   /// provides: public property size_t count

   /// provides: public size_t decode(typeStringUTF strSequence)
   /// provides: public typeStringUTF encode()

   /// provides: public gudtUGC!(typeStringUTF) take(size_t intStart, size_t intCount = 1)
   /// provides: public gudtUGC!(typeStringUTF) takeL(size_t intCount)
   /// provides: public gudtUGC!(typeStringUTF) takeR(size_t intCount)
   /// provides: public gudtUGC!(typeStringUTF) chopL(size_t intCount)
   /// provides: public gudtUGC!(typeStringUTF) chopR(size_t intCount)
   /// provides: public gudtUGC!(typeStringUTF) padL(size_t intCount, typeStringUTF strPadding = r" ")
   /// provides: public gudtUGC!(typeStringUTF) padR(size_t intCount, typeStringUTF strPadding = r" ")

   /// provides: public typeStringUTF takeasUTF(size_t intStart, size_t intCount = 1)

   /// provides: public typeStringUTF takeLasUTF(size_t intCount)
   /// provides: public typeStringUTF takeRasUTF(size_t intCount)
   /// provides: public typeStringUTF chopLasUTF(size_t intCount)
   /// provides: public typeStringUTF chopRasUTF(size_t intCount)

   /// provides: public typeStringUTF padLasUTF(size_t intCount, typeStringUTF strPadding = r" ")
   /// provides: public typeStringUTF padRasUTF(size_t intCount, typeStringUTF strPadding = r" ")

   /// usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).take(35, 3).take(1,2).take(1,1).encode(); /// 日
   /// usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).take(35).encode(); /// 日
   /// usage; eg: stringUGC32("äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese"d).takeasUTF(35); /// 日


   void popFront() { ++pintSequenceCurrent; }

   bool empty() { return pintSequenceCurrent == pintSequenceCount; }
   typeStringUTF front() { return takeasUTF(pintSequenceCurrent); }


   private stringUGC[] pugcSequence;
   private size_t pintSequenceCount = cast(size_t) 0;
   private size_t pintSequenceCurrent = cast(size_t) 0;

   @property public size_t count() { return pintSequenceCount; }

   this(
      const typeStringUTF lstrSequence
      ) {

      /// (1) given UTF‐encoded sequence

      decode(lstrSequence);

   }

   @safe public size_t decode( /// UniCode (UTF‐encoded → grapheme‐cluster) sequence

      const typeStringUTF lstrSequence
      ) {

      /// (1) given UTF‐encoded sequence

      size_t lintSequenceCount = cast(size_t) 0;

      if (lstrSequence is null) {

         pugcSequence = null;
         pintSequenceCount = cast(size_t) 0;
         pintSequenceCurrent = cast(size_t) 0;

      } else {

         pugcSequence = lstrSequence.byGrapheme.array;
         pintSequenceCount = pugcSequence.walkLength;
         pintSequenceCurrent = cast(size_t) 1;

         lintSequenceCount = pintSequenceCount;

      }

      return lintSequenceCount;

   }

   @safe public typeStringUTF encode() { /// UniCode (grapheme‐cluster → UTF‐encoded) sequence


      typeStringUTF lstrSequence = null;

      if (pintSequenceCount >= cast(size_t) 1) {

         lstrSequence = pugcSequence
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            ;

      }

      return lstrSequence;

   }

   @safe public gudtUGC!(typeStringUTF) take( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence

      const size_t lintStart,
      const size_t lintCount = cast(size_t) 1
      ) {

      /// (1) given start position >= 1
      /// (2) given count >= 1

      gudtUGC!(typeStringUTF) lugcSequence;

      if (lintStart >= cast(size_t) 1 && lintCount >= cast(size_t) 1) {

         /// eg#1: takeasUTF(1,3) → range#1=start-1=1-1=0 and range#2=range#1+count=0+3=3 → 0..3
         /// eg#1: takeasUTF(6,3) → range#1=start-1=6-1=5 and range#2=range#1+count=5+3=8 → 5..8

         /// eg#2: takeasUTF(01,1) → range#1=start-1=01-1=00 and range#2=range#1+count=00+1=01 → 00..01
         /// eg#2: takeasUTF(50,1) → range#1=start-1=50-1=49 and range#2=range#1+count=49+1=50 → 49..50


         size_t lintRange1 = lintStart - cast(size_t) 1;
         size_t lintRange2 = lintRange1 + lintCount;

         if (lintRange2 <= pintSequenceCount) {

            lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence[lintRange1..lintRange2]

               .map!((ref g) => g[])
               .joiner
               .to!(typeStringUTF)
               );

         }

      }

      return lugcSequence;

   }

   @safe public gudtUGC!(typeStringUTF) takeL( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      gudtUGC!(typeStringUTF) lugcSequence;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence
            .take(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            );

      }

      return lugcSequence;

   }

   @safe public gudtUGC!(typeStringUTF) takeR( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      gudtUGC!(typeStringUTF) lugcSequence;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence
            .tail(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            );

      }

      return lugcSequence;

   }

   @safe public gudtUGC!(typeStringUTF) chopL( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      gudtUGC!(typeStringUTF) lugcSequence;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence
            .drop(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            );

      }

      return lugcSequence;

   }

   @safe public gudtUGC!(typeStringUTF) chopR( /// UniCode (grapheme‐cluster → grapheme‐cluster) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      gudtUGC!(typeStringUTF) lugcSequence;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lugcSequence = gudtUGC!(typeStringUTF)(pugcSequence
            .dropBack(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            );

      }

      return lugcSequence;

   }

   @safe public typeStringUTF takeasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintStart,
      const size_t lintCount = cast(size_t) 1
      ) {

      /// (1) given start position >= 1
      /// (2) given count >= 1

      typeStringUTF lstrSequence = null;

      if (lintStart >= cast(size_t) 1 && lintCount >= cast(size_t) 1) {

         /// eg#1: takeasUTF(1,3) → range#1=start-1=1-1=0 and range#2=range#1+count=0+3=3 → 0..3
         /// eg#1: takeasUTF(6,3) → range#1=start-1=6-1=5 and range#2=range#1+count=5+3=8 → 5..8

         /// eg#2: takeasUTF(01,1) → range#1=start-1=01-1=00 and range#2=range#1+count=00+1=01 → 00..01
         /// eg#2: takeasUTF(50,1) → range#1=start-1=50-1=49 and range#2=range#1+count=49+1=50 → 49..50


         size_t lintRange1 = lintStart - cast(size_t) 1;
         size_t lintRange2 = lintRange1 + lintCount;

         if (lintRange2 <= pintSequenceCount) {

            lstrSequence = pugcSequence[lintRange1..lintRange2]
               .map!((ref g) => g[])
               .joiner
               .to!(typeStringUTF)
               ;

         }

      }

      return lstrSequence;

   }

   @safe public typeStringUTF takeLasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lstrSequence = pugcSequence
            .take(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            ;

      }

      return lstrSequence;

   }

   @safe public typeStringUTF takeRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lstrSequence = pugcSequence
            .tail(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            ;

      }

      return lstrSequence;

   }

   @safe public typeStringUTF chopLasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lstrSequence = pugcSequence
            .drop(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            ;

      }

      return lstrSequence;

   }

   @safe public typeStringUTF chopRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount
      ) {

      /// (1) given count >= 1

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount <= pintSequenceCount) {


         lstrSequence = pugcSequence
            .dropBack(lintCount)
            .map!((ref g) => g[])
            .joiner
            .to!(typeStringUTF)
            ;

      }

      return lstrSequence;

   }

   @safe public typeStringUTF padLasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount,
      const typeStringUTF lstrPadding = cast(typeStringUTF) r" "
      ) {

      /// (1) given count >= 1
      /// [2] given padding (default is a single blank space)

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount > pintSequenceCount) {


         lstrSequence = null; /// pending

      }

      return lstrSequence;

   }

   @safe public typeStringUTF padRasUTF( /// UniCode (grapheme‐cluster → UTF‐encoded) sequence

      const size_t lintCount,
      const typeStringUTF lstrPadding = cast(typeStringUTF) r" "
      ) {

      /// (1) given count >= 1
      /// [2] given padding (default is a single blank space)

      typeStringUTF lstrSequence = null;

      if (lintCount >= cast(size_t) 1 && lintCount > pintSequenceCount) {


         lstrSequence = null; /// pending

      }

      return lstrSequence;

   }

}

unittest {

   version (useUTF08) {

      stringUTF08 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"c;
      stringUTF08 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"c;
      stringUTF08 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"c;

   }

   version (useUTF16) {

      stringUTF16 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"w;
      stringUTF16 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"w;
      stringUTF16 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"w;

   }

   version (useUTF32) {

      stringUTF32 lstrSequence1 = r"12345678901234567890123456789012345678901234567890"d;
      stringUTF32 lstrSequence2 = r"1234567890АВГДЕЗИЙКЛABCDEFGHIJabcdefghijQRSTUVWXYZ"d;
      stringUTF32 lstrSequence3 = "äëåčñœß … russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎"d;

   }

   size_t lintSequence1sizeUTF = lstrSequence1.length;
   size_t lintSequence2sizeUTF = lstrSequence2.length;
   size_t lintSequence3sizeUTF = lstrSequence3.length;

   size_t lintSequence1sizeUGA = lstrSequence1.walkLength;
   size_t lintSequence2sizeUGA = lstrSequence2.walkLength;
   size_t lintSequence3sizeUGA = lstrSequence3.walkLength;

   size_t lintSequence1sizeUGC = lstrSequence1.byGrapheme.walkLength;
   size_t lintSequence2sizeUGC = lstrSequence2.byGrapheme.walkLength;
   size_t lintSequence3sizeUGC = lstrSequence3.byGrapheme.walkLength;


   assert(lintSequence1sizeUGC == cast(size_t) 50);
   assert(lintSequence2sizeUGC == cast(size_t) 50);
   assert(lintSequence3sizeUGC == cast(size_t) 50);

   assert(lintSequence1sizeUGA == cast(size_t) 50);
   assert(lintSequence2sizeUGA == cast(size_t) 50);
   assert(lintSequence3sizeUGA == cast(size_t) 52);

   version (useUTF08) {
   assert(lintSequence1sizeUTF == cast(size_t) 50);
   assert(lintSequence2sizeUTF == cast(size_t) 60);
   assert(lintSequence3sizeUTF == cast(size_t) 91);
   }

   version (useUTF16) {
   assert(lintSequence1sizeUTF == cast(size_t) 50);
   assert(lintSequence2sizeUTF == cast(size_t) 50);
   assert(lintSequence3sizeUTF == cast(size_t) 57);
   }

   version (useUTF32) {
   assert(lintSequence1sizeUTF == cast(size_t) 50);
   assert(lintSequence2sizeUTF == cast(size_t) 50);
   assert(lintSequence3sizeUTF == cast(size_t) 52);
   }

   /// the following should be the same regardless of the encoding being used and is the whole point of this UDT being made:

   version (useUTF08) { alias stringUTF = stringUTF08; stringUGC08 lugcSequence3 = stringUGC08(lstrSequence3); }
   version (useUTF16) { alias stringUTF = stringUTF16; stringUGC16 lugcSequence3 = stringUGC16(lstrSequence3); }
   version (useUTF32) { alias stringUTF = stringUTF32; stringUGC32 lugcSequence3 = stringUGC32(lstrSequence3); }


   assert(lugcSequence3.encode() == lstrSequence3);

   assert(lugcSequence3.take(35, 3).take(1,2).take(1,1).encode() == cast(stringUTF) r"日");

   assert(lugcSequence3.take(21).encode() == cast(stringUTF) r"р");
   assert(lugcSequence3.take(27).encode() == cast(stringUTF) r"й");
   assert(lugcSequence3.take(35).encode() == cast(stringUTF) r"日");
   assert(lugcSequence3.take(37).encode() == cast(stringUTF) r"語");
   assert(lugcSequence3.take(21, 7).encode() == cast(stringUTF) r"русский");
   assert(lugcSequence3.take(35, 3).encode() == cast(stringUTF) r"日本語");


   assert(lugcSequence3.takeasUTF(21) == cast(stringUTF) r"р");
   assert(lugcSequence3.takeasUTF(27) == cast(stringUTF) r"й");
   assert(lugcSequence3.takeasUTF(35) == cast(stringUTF) r"日");
   assert(lugcSequence3.takeasUTF(37) == cast(stringUTF) r"語");

   assert(lugcSequence3.takeasUTF(21, 7) == cast(stringUTF) r"русский");
   assert(lugcSequence3.takeasUTF(35, 3) == cast(stringUTF) r"日本語");

   assert(lugcSequence3.takeL(1).encode() == cast(stringUTF) r"ä");
   assert(lugcSequence3.takeR(1).encode() == cast(stringUTF) r"😎");
   assert(lugcSequence3.takeL(7).encode() == cast(stringUTF) r"äëåčñœß");
   assert(lugcSequence3.takeR(16).encode() == cast(stringUTF) r"日本語 = japanese 😎");


   assert(lugcSequence3.takeLasUTF(1) == cast(stringUTF) r"ä");
   assert(lugcSequence3.takeRasUTF(1) == cast(stringUTF) r"😎");

   assert(lugcSequence3.takeLasUTF(7) == cast(stringUTF) r"äëåčñœß");
   assert(lugcSequence3.takeRasUTF(16) == cast(stringUTF) r"日本語 = japanese 😎");

   assert(lugcSequence3.chopL(10).encode() == cast(stringUTF) r"russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎");
   assert(lugcSequence3.chopR(21).encode() == cast(stringUTF) r"äëåčñœß … russian = русский 🇷🇺");

   assert(lugcSequence3.chopLasUTF(10) == cast(stringUTF) r"russian = русский 🇷🇺 ≠ 🇯🇵 日本語 = japanese 😎");
   assert(lugcSequence3.chopRasUTF(21) == cast(stringUTF) r"äëåčñœß … russian = русский 🇷🇺");


   version (useUTF08) { stringUTF08 lstrSequence3reencoded; }
   version (useUTF16) { stringUTF16 lstrSequence3reencoded; }
   version (useUTF32) { stringUTF32 lstrSequence3reencoded; }

   for (
      size_t lintSequenceUGC = cast(size_t) 1;
      lintSequenceUGC <= lintSequence3sizeUGC;
      ++lintSequenceUGC
      ) {

      lstrSequence3reencoded ~= lugcSequence3.takeasUTF(lintSequenceUGC);


   }

   assert(lstrSequence3reencoded == lstrSequence3);

   lstrSequence3reencoded = null;

   version (useUTF08) { foreach (stringUTF08 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }
   version (useUTF16) { foreach (stringUTF16 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }
   version (useUTF32) { foreach (stringUTF32 lstrSequence3UGC; lugcSequence3) { lstrSequence3reencoded ~= lstrSequence3UGC; } }

   //assert(lstrSequence3reencoded == lstrSequence3); /// ooops … always missing last grapheme‐cluster: possible bug # 20483


}
```
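
One possible reading of the "always missing last grapheme‐cluster" note, going only by the code as posted: decode() starts pintSequenceCurrent at 1, and empty() returns true as soon as that cursor equals the count, so foreach would stop before the last element is yielded. The sketch below reproduces that behaviour with a simplified one-based range over plain strings; `OneBasedRange` and `collect` are hypothetical names used only here, not part of the module, and the `current > count` test is an assumption about what a fix might look like:

```d
/// Hypothetical, simplified stand-in for the range primitives in the post.
struct OneBasedRange(bool stopEarly)
{
    string[] items;
    size_t current = 1; // one-based cursor, as in decode()

    bool empty()
    {
        static if (stopEarly)
            return current == items.length; // same test as the posted empty()
        else
            return current > items.length;  // empty only once the cursor is past the last item
    }

    string front() { return items[current - 1]; }
    void popFront() { ++current; }
}

/// Concatenates everything a foreach over the range yields.
string collect(R)(R r)
{
    string result;
    foreach (s; r)
        result ~= s;
    return result;
}

unittest
{
    auto parts = ["a", "b", "c"];
    assert(collect(OneBasedRange!true(parts)) == "ab");   // last element is skipped
    assert(collect(OneBasedRange!false(parts)) == "abc"); // all elements are visited
}
```

If that reading is right, the skipped element comes from the empty() test itself. An empty sequence would still need separate handling in the real struct, because decode() sets the cursor to 0 rather than 1 in that case.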
