Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 21:20:20 UTC, Stanislav Blinov wrote: On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote: OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed: string s, str = "4A0B1de!2C9~6"; Unless I did something wrong (If anything please tell). By the way on DMD was worse, it was like 5x slower in your version. To add to the already-mentioned difference in allocation strategies, try replacing the input with e.g. a command-line argument. Looping over a literal may be skewing the results. Interesting and I'll try that way. Thanks. Matheus.
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 20:38:11 UTC, ag0aep6g wrote:
> ...
> The second version involves auto-decoding, which isn't actually needed. You can work around it with `str.byCodeUnit.filter!...`. On my machine, times become the same then. Typical output:
>
> str: 401296
> Tim(ms): 138
> Tim(us): 138505
>
> str: 401296
> Tim(ms): 137
> Tim(us): 137376

That's awesome, my timings are pretty much like yours. In fact, now with "byCodeUnit" it's faster. :)

Thanks,

Matheus.
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 20:33:08 UTC, H. S. Teoh wrote:
> On Fri, Mar 04, 2022 at 07:51:44PM +, matheus via ...
> I don't pay any attention to DMD when I'm doing anything remotely performance-related. Its optimizer is known to be suboptimal. :-P

Yes, in fact I usually do my coding/compiling with DMD because it is faster, then I go for LDC for production and speed.

Matheus.
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote: OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed: string s, str = "4A0B1de!2C9~6"; Unless I did something wrong (If anything please tell). By the way on DMD was worse, it was like 5x slower in your version. To add to the already-mentioned difference in allocation strategies, try replacing the input with e.g. a command-line argument. Looping over a literal may be skewing the results.
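A minimal sketch of that suggestion, for anyone who wants to try it: the input is taken from the command line so the compiler cannot treat it as a known literal. The helper name `timeFilter` and the fallback string are made up for this example; the timing uses the same `StopWatch` API as the benchmark in this thread.

```d
import std.algorithm : filter;
import std.ascii : isDigit;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;
import std.utf : byCodeUnit;

// Hypothetical helper: runs the filter `iterations` times over `input`,
// prints the elapsed time, and returns the last filtered string.
string timeFilter(string input, size_t iterations) {
    string s;
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. iterations) {
        s = input.byCodeUnit.filter!(c => c.isDigit).to!string;
    }
    sw.stop();
    writeln("str: ", s, "\nTim(us): ", sw.peek.total!"usecs");
    return s;
}

void main(string[] args) {
    // Use the first command-line argument when given; the literal is only
    // a fallback so the program still runs without arguments.
    string input = args.length > 1 ? args[1] : "4A0B1de!2C9~6";
    timeFilter(input, 1_000_000);
}
```

Run it as e.g. `./bench 4A0B1de!2C9~6` and compare against the literal version; any difference will depend on compiler and flags.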
Re: How to remove all characters from a string, except the integers?
On Fri, Mar 04, 2022 at 08:38:11PM +, ag0aep6g via Digitalmars-d-learn wrote: [...] > The second version involves auto-decoding, which isn't actually > needed. You can work around it with `str.byCodeUnit.filter!...`. On my > machine, times become the same then. [...] And this here is living proof of why autodecoding is a Bad Idea(tm). Whatever happened to Andrei's std.v2 effort?! The sooner we can shed this baggage, the better. T -- The two rules of success: 1. Don't tell everything you know. -- YHL
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote: import std.datetime.stopwatch; import std.stdio: write, writeln, writef, writefln; import std; void printStrTim(string s,StopWatch sw){ writeln("\nstr: ", s ,"\nTim(ms): ", sw.peek.total!"msecs" ,"\nTim(us): ", sw.peek.total!"usecs" ); } void main(){ auto sw = StopWatch(AutoStart.no); string s, str = "4A0B1de!2C9~6"; int j; sw.start(); for(j=0;j<1_000_000;++j){ s=""; foreach(i;str){ (i >= '0' && i <= '9') ? s~=i : null; } } sw.stop(); printStrTim(s,sw); s = ""; sw.reset(); sw.start(); for(j=0;j<1_000_000;++j){ s=""; s = str.filter!(ch => ch.isDigit).to!string; } sw.stop(); printStrTim(s,sw); } Prints: str: 401296 Tim(ms): 306 Tim(us): 306653 str: 401296 Tim(ms): 1112 Tim(us): 1112648 --- Unless I did something wrong (If anything please tell). The second version involves auto-decoding, which isn't actually needed. You can work around it with `str.byCodeUnit.filter!...`. On my machine, times become the same then. Typical output: str: 401296 Tim(ms): 138 Tim(us): 138505 str: 401296 Tim(ms): 137 Tim(us): 137376
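For readers following along, the `byCodeUnit` workaround can be sketched roughly as below. The function name `digitsOnly` is made up for illustration; `std.utf.byCodeUnit`, `std.algorithm.filter`, `std.ascii.isDigit` and `std.conv.to` are the regular Phobos functions. It assumes only ASCII digits matter, which is what makes skipping the decoding step safe here.

```d
import std.algorithm : filter;
import std.ascii : isDigit;
import std.conv : to;
import std.stdio : writeln;
import std.utf : byCodeUnit;

// Hypothetical helper: keeps only the ASCII digits of `str`.
// byCodeUnit iterates the raw UTF-8 code units, so no auto-decoding
// to dchar happens inside the filter.
string digitsOnly(string str) {
    return str.byCodeUnit.filter!(c => c.isDigit).to!string;
}

void main() {
    writeln(digitsOnly("4A0B1de!2C9~6"));
}
```

Multi-byte UTF-8 sequences never contain bytes in the ASCII range, so filtering code units for '0'..'9' cannot accidentally match part of a non-ASCII character.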
Re: How to remove all characters from a string, except the integers?
On Fri, Mar 04, 2022 at 07:51:44PM +, matheus via Digitalmars-d-learn wrote:
[...]
> for(j=0;j<1_000_000;++j){
>     s="";
>     s = str.filter!(ch => ch.isDigit).to!string;

This line allocates a new string for every single loop iteration. This is generally not something you want to do in an inner loop. :-)

> }
[...]
> Unless I did something wrong (If anything please tell). By the way on
> DMD was worse, it was like 5x slower in your version.
[...]

I don't pay any attention to DMD when I'm doing anything remotely performance-related. Its optimizer is known to be suboptimal. :-P

T

--
Study gravitation, it's a field with a lot of potential.
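One possible way to avoid the per-iteration allocation, sketched under the assumption that a mutable `char[]` buffer is acceptable: keep a single `Appender` outside the loop and `clear()` it each time, which keeps the already-allocated capacity. The function name `filterInto` is made up for this example.

```d
import std.array : Appender, appender;
import std.ascii : isDigit;
import std.stdio : writeln;

// Hypothetical helper: writes the ASCII digits of `str` into `buf`,
// reusing whatever capacity `buf` already has.
void filterInto(ref Appender!(char[]) buf, string str) {
    buf.clear(); // keeps the allocated memory, resets the length
    foreach (char c; str) { // iterate code units, no auto-decoding
        if (c.isDigit) buf.put(c);
    }
}

void main() {
    auto buf = appender!(char[])();
    foreach (j; 0 .. 1_000_000) {
        filterInto(buf, "4A0B1de!2C9~6");
    }
    writeln(buf.data);
}
```

Note that `clear()` is only available on appenders of mutable element types, which is why the buffer is `char[]` rather than `string`.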
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 23:46:49 UTC, H. S. Teoh wrote:
> ...
> This version doesn't even allocate extra storage for the filtered digits, since no storage is actually needed (each digit is spooled directly to the output).

OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed:

LDC 1.27.1, with -O2:

    import std.datetime.stopwatch;
    import std.stdio: write, writeln, writef, writefln;
    import std;

    void printStrTim(string s,StopWatch sw){
        writeln("\nstr: ", s
                ,"\nTim(ms): ", sw.peek.total!"msecs"
                ,"\nTim(us): ", sw.peek.total!"usecs"
        );
    }

    void main(){
        auto sw = StopWatch(AutoStart.no);
        string s, str = "4A0B1de!2C9~6";
        int j;

        sw.start();
        for(j=0;j<1_000_000;++j){
            s="";
            foreach(i;str){
                (i >= '0' && i <= '9') ? s~=i : null;
            }
        }
        sw.stop();
        printStrTim(s,sw);

        s = "";
        sw.reset();
        sw.start();
        for(j=0;j<1_000_000;++j){
            s="";
            s = str.filter!(ch => ch.isDigit).to!string;
        }
        sw.stop();
        printStrTim(s,sw);
    }

Prints:

    str: 401296
    Tim(ms): 306
    Tim(us): 306653

    str: 401296
    Tim(ms): 1112
    Tim(us): 1112648

---

Unless I did something wrong (If anything please tell).

By the way on DMD was worse, it was like 5x slower in your version.

Matheus.
Re: How to remove all characters from a string, except the integers?
On Thu, Mar 03, 2022 at 06:36:35PM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
> On 3/3/22 13:03, H. S. Teoh wrote:
>
> > string s = "blahblah123blehbleh456bluhbluh";
>
> > assert(result == 123456);
>
> I assumed it would generate separate integers 123 and 456. I started
> to implement a range with findSkip, findSplit, and friends but failed.
> :/
[...]

    import std;
    void main() {
        string s = "blahblah123blehbleh456bluhbluh";
        auto result = s.matchAll(regex(`\d+`))
            .each!(m => writeln(m[0]));
    }

Output:

    123
    456

Takes only 3 lines of code. ;-)

T

--
People demand freedom of speech to make up for the freedom of thought which they avoid. -- Soren Aabye Kierkegaard (1813-1855)
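If the separate numbers are wanted as integer values rather than printed text, the same regex idea can be extended along these lines. This is only a sketch: the name `extractNumbers` is made up, the pattern is narrowed to `[0-9]+` on the assumption that only ASCII digits should count, and a run too long for `long` would make `to!long` throw.

```d
import std.algorithm : map;
import std.array : array;
import std.conv : to;
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Hypothetical helper: returns every run of ASCII digits in `s` as a long.
long[] extractNumbers(string s) {
    return s.matchAll(regex(`[0-9]+`))
            .map!(m => m.hit.to!long) // m.hit is the matched slice of `s`
            .array;
}

void main() {
    writeln(extractNumbers("blahblah123blehbleh456bluhbluh"));
}
```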
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 10:34:29 UTC, Ali Çehreli wrote:
> [...]
> isMatched() and chunkOf() are not necessary at all. I wanted to use readable names to fields of the elements of chunkBy instead of the cryptic t[0] and t[1]:

It's delicious, only four lines:

```d
"1,2,3".chunkBy!(n => '0' <= n && n <= '9')
       .filter!(t => t[0])
       .map!(c => c[1])
       .writeln;
```

Thank you very much for this information sharing...

SDB@79
Re: How to remove all characters from a string, except the integers?
On 3/4/22 01:53, Salih Dincer wrote: > On Friday, 4 March 2022 at 07:55:18 UTC, forkit wrote: >> If you get this question at an interview, please remember to first ask >> whether it's ascii or unicode 😀 > > ```d > auto UTFsample = ` > 1 İş 100€, 1.568,38 Türk Lirası > çarşıda eğri 1 çöp 4lınmaz!`; > > UTFsample.splitNumbers.writeln; // [1, 100, 1, 568, 38, 1, 4] > ``` I think what forkit means is, should the function consider numbers made of non-ascii characters as well? For example, the ones on this page: https://www.fileformat.info/info/unicode/category/Nd/list.htm Typical to any programming task, all of us made assumptions on what actually is needed. :) Ali
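To make the ASCII-versus-Unicode question concrete, here is a small sketch contrasting `std.ascii.isDigit` with `std.uni.isNumber`. The two function names `asciiDigits` and `unicodeNumbers` are made up for this example, and the sample text is borrowed from the non-ASCII digits used elsewhere in this thread.

```d
import std.algorithm : filter;
import std.conv : to;
import std.stdio : writeln;
static import std.ascii;
static import std.uni;

// Hypothetical helper: keeps only '0'..'9'.
string asciiDigits(string s) {
    return s.filter!(c => std.ascii.isDigit(c)).to!string;
}

// Hypothetical helper: keeps every character Unicode classifies as a number.
string unicodeNumbers(string s) {
    return s.filter!(c => std.uni.isNumber(c)).to!string;
}

void main() {
    auto sample = "12 ٤٢ ab";
    writeln(asciiDigits(sample));    // only the ASCII digits survive
    writeln(unicodeNumbers(sample)); // the Arabic-Indic digits survive too
}
```

Which of the two is "right" depends entirely on the requirement, which is the point being made above.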
Re: How to remove all characters from a string, except the integers?
On 3/3/22 04:14, BoQsc wrote:

> and if it contains integers, remove all the regular string characters.

Others assumed you wanted integer values but I think you want the digits of the integers. It took me a while to realize that chunkBy can do that:

    // Convenience functions to tuple members of the result
    // of chunkBy when used with a unary predicate.
    auto isMatched(T)(T tuple) {
      return tuple[0];
    }

    // Ditto
    auto chunkOf(T)(T tuple) {
      return tuple[1];
    }

    auto numbers(R)(R range) {
      import std.algorithm : chunkBy, filter, map;
      import std.uni : isNumber;

      return range
             .chunkBy!isNumber
             .filter!isMatched
             .map!chunkOf;
    }

    unittest {
      import std.algorithm : equal, map;
      import std.conv : text;

      // "٤٢" is a non-ASCII number example.
      auto r = "123 ab ٤٢ c 456 xyz 789".numbers;
      assert(r.map!text.equal(["123", "٤٢", "456", "789"]));
    }

    void main() {
    }

isMatched() and chunkOf() are not necessary at all. I wanted to use readable names to fields of the elements of chunkBy instead of the cryptic t[0] and t[1]:

      return range
             .chunkBy!isNumber
             .filter!(t => t[0])   // Not pretty
             .map!(t => t[1]);     // Not pretty

Those functions could not be nested functions because otherwise I would have to write e.g.

      return range
             .chunkBy!isNumber
             .filter!(t => isMatched(t))   // Not pretty
             .map!(t => chunkOf(t));       // Not pretty

To get integer values, .to!int would work as long as the numbers consist of ASCII digits. (I am removing ٤٢.)

      import std.stdio;
      import std.algorithm;
      import std.conv;
      writeln("123 abc 456 xyz 789".numbers.map!(to!int));

Ali
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 07:55:18 UTC, forkit wrote:
> If you get this question at an interview, please remember to first ask whether it's ascii or unicode 😀

```d
auto UTFsample = `
1 İş 100€, 1.568,38 Türk Lirası
çarşıda eğri 1 çöp 4lınmaz!`;

UTFsample.splitNumbers.writeln; // [1, 100, 1, 568, 38, 1, 4]
```
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 12:14:13 UTC, BoQsc wrote:
> I need to check if a string contains integers, and if it contains integers, remove all the regular string characters.
> I've looked around and it seems using regex is the only closest solution.
>
> ```
> import std.stdio;
>
> void main(string[] args){
>     if (args.length > 1){
>         write(args[1]); // Needs to print only integers.
>     } else {
>         write("Please write an argument.");
>     }
> }
> ```

Regular expression solution

```
import std.stdio;
import std.regex;
import std.string: isNumeric;
import std.conv;

void main(string[] args){
    if (args.length > 1){
        writeln(args[1]); // Needs to print only integers.

        // Strip everything except ASCII digits and dots.
        string argument1 = args[1].replaceAll(regex(r"[^0-9.]","g"), "");
        if (argument1.isNumeric){
            writeln(std.conv.to!uint(argument1));
        } else {
            writeln("Invalid value: ", args[1]," (must be an integer)");
        }
    } else {
        write("Please write an argument.");
    }
}
```
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 02:10:11 UTC, Salih Dincer wrote: On Thursday, 3 March 2022 at 20:23:14 UTC, forkit wrote: On Thursday, 3 March 2022 at 19:28:36 UTC, matheus wrote: I'm a simple man who uses D with the old C mentality: [...] ```d string s, str = "4A0B1de!2C9~6"; foreach(i;str){ if(i < '0' || i > '9'){ continue; } s ~= i; } ``` [...] mmm..but we no longer live in simple times ;-) (i.e. unicode) If you look [here](https://github.com/dlang/phobos/blob/master/std/ascii.d#L315), you'll see that it's already the same logic. If it were me I would have even written like this: ```d "4A0B1de!2C9~6".filter!(c => '0' <= c && c <= '9' ).writeln; // 401296 ``` If you get this question at an interview, please remember to first ask whether it's ascii or unicode ;-) " All of the functions in std.ascii accept Unicode characters but effectively ignore them if they're not ASCII." - https://github.com/dlang/phobos/blob/master/std/ascii.d
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 02:36:35 UTC, Ali Çehreli wrote:
> I assumed it would generate separate integers 123 and 456. I started to implement a range with findSkip, findSplit, and friends but failed. :/

I worked on it a little. I guess it's better that way. But I didn't think about negative numbers.

```d
auto splitNumbers(string str) {
  size_t[] n;
  int i = -1;
  bool nextNumber = true;

  foreach(s; str)
  {
    if(s >= '0' && s <= '9')
    {
      if(nextNumber)
      {
        i++;
        n.length++;
        nextNumber = false;
      }
      n[i] = 10 * n[i] + (s - '0');
    }
    else nextNumber = true;
  }
  return n;
}

unittest
{
  auto n = splitNumbers(" 1,23, 456\n\r7890...");
  assert(n[0] == 1Lu);
  assert(n[1] == 23Lu);
  assert(n[2] == 456Lu);
  assert(n[3] == 7890Lu);
}
```

Presumably, D has more compact possibilities for this. This is what I can do while making little use of the library.

Thank you...

SDB@79
Re: How to remove all characters from a string, except the integers?
On Friday, 4 March 2022 at 02:36:35 UTC, Ali Çehreli wrote:
> On 3/3/22 13:03, H. S. Teoh wrote:
>
> >string s = "blahblah123blehbleh456bluhbluh";
>
> >assert(result == 123456);
>
> I assumed it would generate separate integers 123 and 456. I started to implement a range with findSkip, findSplit, and friends but failed. :/
>
> Ali

It's called hit two targets with one arrow:

```d
auto splitNumbers(string str) {
  size_t[] n = [0];
  size_t i;
  foreach(s; str) {
    if(s >= '0' && s <= '9') {
      n[i] = 10 * n[i] + (s - '0');
    } else {
      i++;
      n.length++;
    }
  }
  return n.filter!(c => c > 0);
}

void main() {
  auto s = "abc1234567890def1234567890xyz";
  s.splitNumbers.writeln; // [1234567890, 1234567890]
}
```

SDB@79
Re: How to remove all characters from a string, except the integers?
On 3/3/22 13:03, H. S. Teoh wrote: >string s = "blahblah123blehbleh456bluhbluh"; >assert(result == 123456); I assumed it would generate separate integers 123 and 456. I started to implement a range with findSkip, findSplit, and friends but failed. :/ Ali
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 20:23:14 UTC, forkit wrote:
> On Thursday, 3 March 2022 at 19:28:36 UTC, matheus wrote:
>> I'm a simple man who uses D with the old C mentality:
>> [...]
>> ```d
>> string s, str = "4A0B1de!2C9~6";
>> foreach(i;str){
>>     if(i < '0' || i > '9'){ continue; }
>>     s ~= i;
>> }
>> ```
>> [...]
>
> mmm..but we no longer live in simple times ;-)
>
> (i.e. unicode)

If you look [here](https://github.com/dlang/phobos/blob/master/std/ascii.d#L315), you'll see that it's already the same logic. If it were me I would have even written like this:

```d
"4A0B1de!2C9~6".filter!(c => '0' <= c && c <= '9'
                       ).writeln; // 401296
```
Re: How to remove all characters from a string, except the integers?
On Thu, Mar 03, 2022 at 10:54:39PM +, matheus via Digitalmars-d-learn wrote: > On Thursday, 3 March 2022 at 21:03:40 UTC, H. S. Teoh wrote: [...] > > -- > > void main() { > > string s = "blahblah123blehbleh456bluhbluh"; > > auto result = s.filter!(ch => ch.isDigit).to!int; > > assert(result == 123456); > > } > > -- [...] > PS: I spotted something on your code, you're converting the result to > int, this can lead to a overflow depending the values in the string. If you need to, convert to long instead. Or if you want a string for subsequent manipulation, replace `int` with `string`. Or, if you don't actually need to manipulate the value at all, but just print the digits, then it becomes even simpler: void main() { string s = "blahblah123blehbleh456bluhbluh"; writeln(s.filter!(ch => ch.isDigit)); } This version doesn't even allocate extra storage for the filtered digits, since no storage is actually needed (each digit is spooled directly to the output). T -- The peace of mind---from knowing that viruses which exploit Microsoft system vulnerabilities cannot touch Linux---is priceless. -- Frustrated system administrator.
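As a rough illustration of the overflow concern and the "convert to long instead" advice, the conversion could be wrapped so that an out-of-range or empty digit string is reported rather than thrown. The name `digitsAsLong` and the choice of `Nullable!long` as the return type are assumptions made for this sketch, not something proposed in the thread.

```d
import std.algorithm : filter;
import std.ascii : isDigit;
import std.conv : ConvException, to;
import std.stdio : writeln;
import std.typecons : Nullable, nullable;

// Hypothetical helper: joins all ASCII digits of `s` into one long.
// Returns a null Nullable when there are no digits or the value
// does not fit into a long.
Nullable!long digitsAsLong(string s) {
    auto digits = s.filter!(ch => ch.isDigit).to!string;
    try {
        return nullable(digits.to!long);
    } catch (ConvException) {
        // Covers both the empty string and overflow
        // (ConvOverflowException derives from ConvException).
        return Nullable!long.init;
    }
}

void main() {
    auto r = digitsAsLong("blahblah123blehbleh456bluhbluh");
    if (!r.isNull) writeln(r.get);
}
```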
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 21:03:40 UTC, H. S. Teoh wrote:
> ...
> --
> void main() {
>     string s = "blahblah123blehbleh456bluhbluh";
>     auto result = s.filter!(ch => ch.isDigit).to!int;
>     assert(result == 123456);
> }
> --
>
> Problem solved. Why write 6 lines when 3 will do?

Just because I'm a simple man. :)

I usually program mostly in C, and when in D I go the same way, but using features like GC, strings, AA, etc.

Of course your version is a D'ish way of handling things, and I can't contest that it looks better visually. But if size was a problem I could have written:

    void main(){
        string s, str = "4A0B1de!2C9~6";
        foreach(i;str){
            (i >= '0' && i <= '9') ? s~=i : null;
        }
        writeln(s);
    }

Well, still 1 line off, but it goes with my flow. I mean, this example is a simple one, but usually I can see and understand what code in C is doing (more) easily than in D just by looking at it.

Don't even ask about C++, because I gave up. :)

Matheus.

PS: I spotted something in your code: you're converting the result to int, which can lead to an overflow depending on the values in the string.
Re: How to remove all characters from a string, except the integers?
On Thu, Mar 03, 2022 at 08:23:14PM +, forkit via Digitalmars-d-learn wrote: > On Thursday, 3 March 2022 at 19:28:36 UTC, matheus wrote: > > > > I'm a simple man who uses D with the old C mentality: > > > > import std.stdio; > > > > void main(){ > > string s, str = "4A0B1de!2C9~6"; > > foreach(i;str){ > > if(i < '0' || i > '9'){ continue; } > > s ~= i; > > } > > writeln("Result: ", s); > > } > > > > Result: 401296 > > > > Matheus. > > mmm..but we no longer live in simple times ;-) > > (i.e. unicode) -- void main() { string s = "blahblah123blehbleh456bluhbluh"; auto result = s.filter!(ch => ch.isDigit).to!int; assert(result == 123456); } -- Problem solved. Why write 6 lines when 3 will do? T -- People tell me that I'm skeptical, but I don't believe them.
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 19:28:36 UTC, matheus wrote: I'm a simple man who uses D with the old C mentality: import std.stdio; void main(){ string s, str = "4A0B1de!2C9~6"; foreach(i;str){ if(i < '0' || i > '9'){ continue; } s ~= i; } writeln("Result: ", s); } Result: 401296 Matheus. mmm..but we no longer live in simple times ;-) (i.e. unicode)
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 12:14:13 UTC, BoQsc wrote: I've looked around and it seems using regex is the only closest solution. I'm a simple man who uses D with the old C mentality: import std.stdio; void main(){ string s, str = "4A0B1de!2C9~6"; foreach(i;str){ if(i < '0' || i > '9'){ continue; } s ~= i; } writeln("Result: ", s); } Result: 401296 Matheus.
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 13:55:47 UTC, BoQsc wrote: On Thursday, 3 March 2022 at 13:25:32 UTC, Stanislav Blinov wrote: On Thursday, 3 March 2022 at 12:14:13 UTC, BoQsc wrote: I need to check if a string contains integers, and if it contains integers, remove all the regular string characters. I've looked around and it seems using regex is the only closest solution. ```d import std.stdio; import std.algorithm : find, filter; import std.conv : to; import std.uni : isNumber; void main(string[] args){ if (args.length > 1){ auto filtered = () { auto r = args[1].find!isNumber; // check if a string contains integers return r.length ? r.filter!isNumber.to!string // and if it does, keep only integers : args[1]; // otherwise keep original } (); filtered.writeln; } else { write("Please write an argument."); } } ``` D language should be renamed into Exclamation-mark language. It feels overused everywhere and without a better alternative. But you have no problem with parenthesis and braces?
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 13:25:32 UTC, Stanislav Blinov wrote: auto filtered = () { auto r = args[1].find!isNumber; // check if a string contains integers ``` **When using ```find!isNumber```:** ``` 0123456789 @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_ `abcdefghijklmnopqrstuvwxyz{|}~ ``` **When using ```find!isAlphaNum```:** ``` 0123456789 ```
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 13:25:32 UTC, Stanislav Blinov wrote: On Thursday, 3 March 2022 at 12:14:13 UTC, BoQsc wrote: I need to check if a string contains integers, and if it contains integers, remove all the regular string characters. I've looked around and it seems using regex is the only closest solution. ```d import std.stdio; import std.algorithm : find, filter; import std.conv : to; import std.uni : isNumber; void main(string[] args){ if (args.length > 1){ auto filtered = () { auto r = args[1].find!isNumber; // check if a string contains integers return r.length ? r.filter!isNumber.to!string // and if it does, keep only integers : args[1]; // otherwise keep original } (); filtered.writeln; } else { write("Please write an argument."); } } ``` D language should be renamed into Exclamation-mark language. It feels overused everywhere and without a better alternative.
Re: How to remove all characters from a string, except the integers?
On Thursday, 3 March 2022 at 12:14:13 UTC, BoQsc wrote:
> I need to check if a string contains integers, and if it contains integers, remove all the regular string characters.
> I've looked around and it seems using regex is the only closest solution.

```d
import std.stdio;
import std.algorithm : find, filter;
import std.conv : to;
import std.uni : isNumber;

void main(string[] args){
    if (args.length > 1){

        auto filtered = () {
            auto r = args[1].find!isNumber; // check if a string contains integers
            return r.length ?
                    r.filter!isNumber.to!string // and if it does, keep only integers
                    : args[1];                  // otherwise keep original
        } ();

        filtered.writeln;

    } else {
        write("Please write an argument.");
    }
}
```
How to remove all characters from a string, except the integers?
I need to check if a string contains integers, and if it contains integers, remove all the regular string characters.
I've looked around and it seems using regex is the only closest solution.

```
import std.stdio;

void main(string[] args){
    if (args.length > 1){
        write(args[1]); // Needs to print only integers.
    } else {
        write("Please write an argument.");
    }
}
```