Re: that is bug?
On Sunday, 8 April 2018 at 10:03:33 UTC, kdevel wrote: On Sunday, 8 April 2018 at 07:22:19 UTC, Patrick Schluter wrote: You may find an in-depth discussion of the C++ case in https://stackoverflow.com/questions/7499400/ternary-conditional-and-assignment-operator-precedence My formulation was ambiguous; it is the same precedence, as the link says. The link also says that it's right-to-left associativity. This means that for the expression: a ? b = c : d = e; right-to-left associativity will make the = e assignment higher priority than the b = c assignment or the ternary, even though they have the same priority level. To summarize: C++ works as expected and C prevents the assignment because the conditional operator does not yield an l-value: Exactly. Now, the only thing is to clearly define what it is in D, as apparently it is neither the C++ nor the C behaviour. The old precedence table on the D wiki seems to say it is like C, but the example of that thread seems to show it's not.
Re: that is bug?
On Sunday, 8 April 2018 at 16:47:59 UTC, Patrick Schluter wrote: On Sunday, 8 April 2018 at 10:03:33 UTC, kdevel wrote: On Sunday, 8 April 2018 at 07:22:19 UTC, Patrick Schluter wrote: [...] To summarize: C++ works as expected and C prevents the assignment because the conditional operator does not yield an l-value: Exactly. Now, the only thing is to clearly define what it is in D, as apparently it is neither the C++ nor the C behaviour. The old precedence table on the D wiki seems to say it is like C, but the example of that thread seems to show it's not. To follow up: what's surprising for a C guy like me is that D accepts (a=1)=2; without problems, i.e. that (a=1) is an lvalue.
Re: that is bug?
On Monday, 9 April 2018 at 14:28:54 UTC, MattCoder wrote: On Monday, 9 April 2018 at 03:35:07 UTC, Ali Çehreli wrote: ... I don't have any problem with that part either. The following makes sense to me. I may have even used it in the past (likely in C++): (cond ? a : b) = foo; ... For me as a C programmer this is different. What happens in this case? Will it assign foo to either a or b? Except that it is not allowed in standard C. gcc answers that with "error: lvalue required as left operand of assignment".
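For readers coming from C, here is a minimal sketch of how the same effect is usually spelled there. Since the conditional operator does not yield an lvalue in C, one selects between two addresses and assigns through the resulting pointer. The helper name assign_selected is hypothetical and only serves the illustration.

```c
/* Hypothetical helper: stores value into *a when cond is non-zero,
 * otherwise into *b. Writing (cond ? *a : *b) = value is rejected by a
 * C compiler because the conditional operator does not yield an lvalue,
 * but the pointer it selects can be dereferenced and assigned through. */
static void assign_selected(int cond, int *a, int *b, int value)
{
    *(cond ? a : b) = value;
}
```

Called as assign_selected(cond, &a, &b, foo), this does what (cond ? a : b) = foo; expresses in C++.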
Re: Small Buffer Optimization for string and friends
On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. If the length field serves multiple purposes, it would be even better to reserve more than just one bit. For all practical purposes 48 bits or 56 bits are more than enough to handle all possible lengths. This would liberate 8 or even 16 bits that can be used for other purposes.
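To make the bit budget concrete, here is a small C sketch of a 64-bit length word that keeps 56 bits for the length and frees the top 8 bits for a tag. It is not the layout anyone in the thread proposed for D; the names LEN_BITS, LEN_MASK, pack_length, unpack_length and unpack_tag are hypothetical.

```c
#include <stdint.h>

/* Assumption for this sketch: 56 bits of length, 8 bits of tag. */
#define LEN_BITS 56
#define LEN_MASK ((UINT64_C(1) << LEN_BITS) - 1)

/* Combines a length and a tag into one 64-bit word. Length bits above
 * bit 55 are silently dropped, so callers must keep length <= LEN_MASK. */
static uint64_t pack_length(uint64_t length, uint8_t tag)
{
    return ((uint64_t)tag << LEN_BITS) | (length & LEN_MASK);
}

/* Extracts the 56-bit length. */
static uint64_t unpack_length(uint64_t word)
{
    return word & LEN_MASK;
}

/* Extracts the 8-bit tag stored in the highest-order byte. */
static uint8_t unpack_tag(uint64_t word)
{
    return (uint8_t)(word >> LEN_BITS);
}
```

Because the tag lives in the highest-order bits of the value rather than at a fixed memory offset, the code does not depend on the byte order of the machine.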
Re: core.stdc and betterC
On Sunday, 29 April 2018 at 15:40:20 UTC, Jacob Carlborg wrote: On 2018-04-29 16:42, dd886k wrote: [...] Looks like "putchar" is inlined [1]. That means the "putchar" you're referencing is not the one in the C standard library but it's implemented in druntime. That means you need to link with druntime/phobos, it's not enough to link with the C standard library. I don't know why it was done this way. Perhaps it's just a macro on some platforms. [1] https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1289 Yes, putchar is often implemented as the macro #define putchar(x) putc(x, stdout)
Re: Sealed classes - would you want them in D?
On Tuesday, 15 May 2018 at 02:32:05 UTC, KingJoffrey wrote: On Tuesday, 15 May 2018 at 02:00:17 UTC, 12345swordy wrote: On Tuesday, 15 May 2018 at 00:28:42 UTC, KingJoffrey wrote: On Monday, 14 May 2018 at 19:40:18 UTC, 12345swordy wrote: [...] If 'getting a module to respect the encapsulation boundaries the programmer puts in place' would change the language so 'fundamentally', then the language 'already' presents big problems for large complex application development. Evidence for this claim please. - Object independence - Do not violate encapsulation - Respect the interface All large software projects are done in (or moving toward) languages that respect these idioms. Those that don't, are the ones we typically have problems with. Isn't that evidence enough? That's not evidence, that's pure opinion. There's not a shred of data in that list.
Re: Of possible interest: fast UTF8 validation
On Thursday, 17 May 2018 at 05:01:54 UTC, Joakim wrote: On Wednesday, 16 May 2018 at 20:11:35 UTC, Andrei Alexandrescu wrote: On 5/16/18 1:18 PM, Joakim wrote: On Wednesday, 16 May 2018 at 16:48:28 UTC, Dmitry Olshansky wrote: On Wednesday, 16 May 2018 at 15:48:09 UTC, Joakim wrote: On Wednesday, 16 May 2018 at 11:18:54 UTC, Andrei Alexandrescu wrote: https://www.reddit.com/r/programming/comments/8js69n/validating_utf8_strings_using_as_little_as_07/ Sigh, this reminds me of the old quote about people spending a bunch of time making more efficient what shouldn't be done at all. Validating UTF-8 is super common, most text protocols and files these days would use it, other would have an option to do so. I’d like our validateUtf to be fast, since right now we do validation every time we decode string. And THAT is slow. Trying to not validate on decode means most things should be validated on input... I think you know what I'm referring to, which is that UTF-8 is a badly designed format, not that input validation shouldn't be done. I find this an interesting minority opinion, at least from the perspective of the circles I frequent, where UTF8 is unanimously heralded as a great design. Only a couple of weeks ago I saw Dylan Beattie give a very entertaining talk on exactly this topic: https://dotnext-piter.ru/en/2018/spb/talks/2rioyakmuakcak0euk0ww8/ Thanks for the link, skipped to the part about text encodings, should be fun to read the rest later. If you could share some details on why you think UTF8 is badly designed and how you believe it could be/have been better, I'd be in your debt! Unicode was a standardization of all the existing code pages and then added these new transfer formats, but I have long thought that they'd have been better off going with a header-based format that kept most languages in a single-byte scheme, This is not practical, sorry. What happens when your message loses the header? Exactly, the rest of the message is garbled. 
That's exactly what happened with code page based texts when you don't know in which code page they are encoded. It has the supplemental inconvenience that mixing languages becomes impossible or at least very cumbersome. UTF-8 has several properties that are difficult to have with other schemes. - It is stateless, meaning that any byte in a stream always means the same thing. Its meaning does not depend on external information or on a previous byte. - It can mix any language in the same stream without acrobatics, and anyone who thinks that mixing languages doesn't happen often should get his head extracted from his rear, because it is very common (check Wikipedia's front page for example). - The multi-byte nature of other alphabets is not as bad as people think, because texts in computers do not live on their own, meaning that they are generally embedded inside file formats, which more often than not are extremely bloated (xml, html, xliff, akoma ntoso, rtf etc.). The few bytes more in the text do not weigh that much. I'm in charge at the European Commission of the biggest translation memory in the world. It currently handles 30 languages, and without UTF-8 and UTF-16 it would be unmanageable. I still remember when I started there in 2002, when we handled only 11 languages, of which only 1 used another alphabet (Greek). Everything was based on RTF with codepages and it was a braindead mess. My first job in 2003 was to extend the system to handle the 8 newcomer languages, and with ASCII-based encodings it was completely unmanageable because every document processed mixes languages and alphabets freely (addresses and names are often written in their original form, for instance). 2 years ago we also implemented support for Chinese. The nice thing was that we didn't have to change much to do that, thanks to Unicode. The second surprise was with the file sizes: Chinese documents were generally smaller than their European counterparts. 
Yes CJK requires 3 bytes for each ideogram, but generally 1 ideogram replaces many letters. The ideogram 亿 replaces "One hundred million" for example, which of them take more bytes? So if CJK indeed requires more bytes to encode, it is firstly because they NEED many more bits in the first place (there are around 3 CJK codepoints in the BMP alone, add to it the 6 that are in the SIP and we have a need of 17 bits only to encode them. as they mostly were except for obviously the Asian CJK languages. That way, you optimize for the common string, ie one that contains a single language or at least no CJK, rather than pessimizing every non-ASCII language by doubling its character width, as UTF-8 does. This UTF-8 issue is one of the first topics I raised in this forum, but as you noted at the time nobody agreed and I don't want to dredge that all up again. I have been researching this a bit since then, and the stated goals for UTF-8 at inception were that it _could not overlap
Re: Of possible interest: fast UTF8 validation
On Thursday, 17 May 2018 at 15:37:01 UTC, Andrei Alexandrescu wrote: On 05/17/2018 09:14 AM, Patrick Schluter wrote: I'm in charge at the European Commission of the biggest translation memory in the world. Impressive! Is that the Europarl? No, Euramis. The central translation memory developed by the Commission and used also by the other institutions. The database contains more than a billion segments from parallel texts and is afaik the biggest of its kind. One of the big strengths of the Euramis TM is its multi-target language store; this allows fuzzy searches in all combinations, including indirect translations (i.e. if a document written in English was translated into Romanian and Maltese, it is then possible to search for alignments between ro and mt). It's not the only system to do that, but at that volume it is quite unique. We also publish every year an extract of it, covering the published legislation [1] from the Official Journal, so that it can be used by the research community. All the machine translation engines use it. It is one of the most accessed data collections on the European Open Data portal [2]. The very uncommon thing about the backend software of EURAMIS is that it is written in C. Pure unadulterated C. I'm trying to introduce D, but with the strange (to put it politely) configurations our servers have it is quite challenging. [1]: https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory [2]: http://data.europa.eu/euodp/fr/data
Re: Of possible interest: fast UTF8 validation
On Thursday, 17 May 2018 at 15:16:19 UTC, Joakim wrote: On Thursday, 17 May 2018 at 13:14:46 UTC, Patrick Schluter wrote: This is not practical, sorry. What happens when your message loses the header? Exactly, the rest of the message is garbled. Why would it lose the header? TCP guarantees delivery and checksums the data, that's effective enough at the transport layer. What has TCP/IP got to do with anything in the discussion here? UTF-8 (or UTF-16 or UTF-32) has nothing to do with network protocols. That's completely unrelated. A file encoded on a disk may never leave the machine it is written on and may never see a wire in its lifetime, and its encoding is still of vital importance. That's why a header encoding is too restrictive. I agree that UTF-8 is a more redundant format, as others have mentioned earlier, and is thus more robust to certain types of data loss than a header-based scheme. However, I don't consider that the job of the text format, it's better done by other layers, like transport protocols or filesystems, which will guard against such losses much more reliably and efficiently. No. A text format cannot depend on a network protocol. It would be as if you could only listen to a music file or watch a video via streaming and never save it to an offline file, because the information about what that blob of bytes represents was stored nowhere. It doesn't make any sense. For example, a random bitflip somewhere in the middle of a UTF-8 string will not be detectable most of the time. However, more robust error-correcting schemes at other layers of the system will easily catch that. That's the job of the other layers. Any other file would have the same problem. At least with UTF-8 there will only ever be at most 1 codepoint lost or changed. No other encoding would fare better. That said, if a checksum header for your document is important, you can add it externally anyway. 
That's exactly what happened with code page based texts when you don't know in which code page it is encoded. It has the supplemental inconvenience that mixing languages becomes impossible or at least very cumbersome. UTF-8 has several properties that are difficult to have with other schemes. - It is state-less, means any byte in a stream always means the same thing. Its meaning does not depend on external or a previous byte. I realize this was considered important at one time, but I think it has proven to be a bad design decision, for HTTP too. There are some advantages when building rudimentary systems with crude hardware and lots of noise, as was the case back then, but that's not the tech world we live in today. That's why almost every HTTP request today is part of a stateful session that explicitly keeps track of the connection, whether through cookies, https encryption, or HTTP/2. Again, orthogonal to UTF-8. When I speak above of streams, it isn't limited to sockets; files are also read as streams. So stop equating UTF-8 with the Internet, these are 2 different domains. The Internet and its protocols were defined and invented long before Unicode, and Unicode is also very useful offline. - It can mix any language in the same stream without acrobatics and if one thinks that mixing languages doesn't happen often should get his head extracted from his rear, because it is very common (check wikipedia's front page for example). I question that almost anybody needs to mix "streams." As for messages or files, headers handle multiple language mixing easily, as noted in that earlier thread. 
Ok, show me how you transmit that, I'm curious: E2010C0002 EFTA Surveillance Authority Decision Beschluss der EFTA-Überwachungsbehörde EFTA-Tilsynsmyndighedens beslutning Απόφαση της Εποπτεύουσας Αρχής της ΕΖΕΣ Decisión del Órgano de Vigilancia de la AELC EFTAn valvontaviranomaisen päätös Décision de l'Autorité de surveillance AELE Decisione dell’Autorità di vigilanza EFTA Besluit van de Toezichthoudende Autoriteit van de EVA Decisão do Órgão de Fiscalização da EFTA Beslut av Eftas övervakningsmyndighet EBTA Uzraudzības iestādes Lēmums Rozhodnutí Kontrolního úřadu ESVO EFTA järelevalveameti otsus Decyzja Urzędu Nadzoru EFTA Odločba Nadzornega organa EFTE ELPA priežiūros institucijos sprendimas Deċiżjoni tal-Awtorità tas-Sorveljanza tal-EFTA Rozhodnutie Dozorného orgánu EZVO Решение на Надзорния орган на ЕАСТ - The multi byte nature of other alphabets is not as bad as people think because texts in computer do not live on their own, meaning that they are generally embedded inside file formats, which more often than not are extremely bloated (xml, html, xliff, akoma ntoso, rtf etc.). The few bytes more in the text do not weigh that much. Heh, the other parts of the tech stack are much more bloated, so this bloat is okay? A unique argument, but I'd argue that's why those bloated formats you mention are largely dying off too. They don't, i
Re: Of possible interest: fast UTF8 validation
On Thursday, 17 May 2018 at 23:16:03 UTC, H. S. Teoh wrote: On Thu, May 17, 2018 at 07:13:23PM +, Patrick Schluter via Digitalmars-d wrote: [...] [...] Yes. Imagine if we standardized on a header-based string encoding, and we wanted to implement a substring function over a string that contains multiple segments of different languages. Instead of a cheap slicing over the string, you'd need to scan the string or otherwise keep track of which segment the start/end of the substring lies in, allocate memory to insert headers so that the segments are properly interpreted, etc.. It would be an implementational nightmare, and an unavoidable performance hit (you'd have to copy data every time you take a substring), and the @nogc guys would be up in arms. [...] That's essentially what rtf with code pages was. I'm happy that we got rid of it and that it was replaced by xml. Even if Microsoft's document xml is a bloated, ridiculous mess, it's still an order of magnitude less problematic than rtf (I mean at the text encoding level).
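The "cheap slicing" point can be illustrated with a few lines of C. In UTF-8 a continuation byte always has the form 10xxxxxx, so any byte can be classified on its own, and an arbitrary byte offset can be moved back to the start of its code point without any header or state. This is only a sketch: the helper names utf8_is_continuation and utf8_align_back are hypothetical, and the code assumes the input is already valid UTF-8.

```c
#include <stddef.h>

/* A byte of the form 10xxxxxx is a continuation byte. Every other byte
 * starts a code point (validity is not checked in this sketch). */
static int utf8_is_continuation(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

/* Hypothetical helper: moves pos back to the first byte of the code
 * point that contains it. Needs at most 3 steps on valid UTF-8 and no
 * knowledge of anything that came earlier in the string. */
static size_t utf8_align_back(const char *s, size_t pos)
{
    while (pos > 0 && utf8_is_continuation((unsigned char)s[pos]))
        pos--;
    return pos;
}
```

A substring is then just a pair of aligned offsets into the original buffer, with no copying and no headers to insert.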
Re: Online impersonation
On Thursday, 24 May 2018 at 06:32:23 UTC, Dukc wrote: On Wednesday, 23 May 2018 at 17:31:40 UTC, Steven Schveighoffer wrote: The IP address is included in the headers of the newsgroup. All of them came from the same IP. I have a filter on my thunderbird client to flag certain IPs, and his was added to the list recently. Then again, it's possible they're family members or neighbours using the same IP. How likely this is, I won't comment. I don't think this is a case of impersonation if you're right, since the aliases have not been trying to impersonate any real, exact person. But dishonourable action nonetheless. It was quite obvious that KingJoffrey could be the sock puppeteer, as he was childish and unreasonable all along the thread about private class members.
Re: Ideas for students' summer projects
On Thursday, 24 May 2018 at 18:07:53 UTC, Patrick Schluter wrote: On Wednesday, 23 May 2018 at 01:33:19 UTC, Mike Franklin wrote: On Tuesday, 22 May 2018 at 16:27:05 UTC, Eduard Staniloiu wrote: Let the brainstorming begin! I would like to see a dependency-less Phobos-like library that can be used by the DMD compiler, druntime, -betterC, and other runtime-less/phobos-less use cases. It would have no dependencies whatsoever. As a contrived illustration, take a look at the code in https://github.com/dlang/druntime/blob/master/src/core/internal/string.d Those same features are also in Phobos. OT, numberDigits enters an infinite loop if radix == 1, and crashes for radix == 0.
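To show why those two radix values are special, here is a hedged C sketch of a digit-counting loop with the guard added. It is not the druntime code; number_digits is a hypothetical counterpart written only for illustration, and returning 0 for an invalid radix is a choice made for this sketch.

```c
/* Hypothetical C counterpart of a digit counter, not the druntime code.
 * Returns how many digits are needed to write value in the given radix,
 * or 0 when the radix is not usable. */
static unsigned number_digits(unsigned long long value, unsigned radix)
{
    unsigned digits = 1;

    /* Without this guard, radix 0 divides by zero below and radix 1
     * never shrinks value, so the loop would not terminate. */
    if (radix < 2)
        return 0;

    while (value >= radix) {
        value /= radix;
        digits++;
    }
    return digits;
}
```

Whether to return an error value, assert, or restrict the parameter type is a design decision; the point is only that radix values below 2 need explicit handling.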
Re: Ideas for students' summer projects
On Wednesday, 23 May 2018 at 01:33:19 UTC, Mike Franklin wrote: On Tuesday, 22 May 2018 at 16:27:05 UTC, Eduard Staniloiu wrote: Let the brainstorming begin! I would like to see a dependency-less Phobos-like library that can be used by the DMD compiler, druntime, -betterC, and other runtime-less/phobos-less use cases. It would have no dependencies whatsoever. As a contrived illustration, take a look at the code in https://github.com/dlang/druntime/blob/master/src/core/internal/string.d Those same features are also in Phobos. OT, numberDigits enters an infinite loop if radix == 1.
Re: Morale of a story: ~ is the right choice for concat operator
On Friday, 25 May 2018 at 23:05:51 UTC, Jonathan M Davis wrote: Sure, it can be argued that this should be unnecessary and that the programmer should just get it right, but it's not an altogether uncommon bug to screw up case statements and inadvertently fall through to the next one when you meant to put a break or some other control statement there. Originally, implicit fallthrough was perfectly legal in D just like it is in C or C++. However, when it was made illegal, it caught quite a few bugs in existing programs - including at companies using D. This change to the language fixed bugs and almost certainly saved people time and money. And that the issue is real in C is also illustrated by the fact that gcc warns about implicit fallthrough since version 7. One has to add at least a comment to suppress the warning (btw the implementation of the heuristic that analyses the comments is more or less broken; I had to file my first bug report to gcc about it).
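For illustration, a small C sketch of what an intentional fallthrough looks like once it is annotated for gcc. The function steps_remaining is a made-up example. The comment form used here is one that gcc's -Wimplicit-fallthrough heuristic is documented to recognise at its default level; gcc also offers __attribute__((fallthrough)) as a non-comment alternative.

```c
/* Made-up example: counts how many of the stages 3, 2, 1 still have to
 * run when starting at the given level. The fallthroughs are deliberate
 * and marked so that gcc 7+ does not warn about them. */
static int steps_remaining(int level)
{
    int steps = 0;

    switch (level) {
    case 3:
        steps++;
        /* fall through */
    case 2:
        steps++;
        /* fall through */
    case 1:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Removing one of the comments and compiling with -Wextra (which enables -Wimplicit-fallthrough) should make gcc 7 or later point at the corresponding case label.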
Re: Replacing C's memcpy with a D implementation
On Sunday, 10 June 2018 at 13:45:54 UTC, Mike Franklin wrote: On Sunday, 10 June 2018 at 13:16:21 UTC, Adam D. Ruppe wrote: memcpyD: 1 ms, 725 μs, and 1 hnsec memcpyD2: 587 μs and 5 hnsecs memcpyASM: 119 μs and 5 hnsecs Still, the ASM version is much faster. rep movsd is very CPU dependent and needs some preconditions to be fast. For relatively short memory blocks it sucks on many CPUs other than the latest Intel ones. See what Agner Fog has to say about it: 16.10 String instructions (all processors) String instructions without a repeat prefix are too slow and should be replaced by simpler instructions. The same applies to LOOP on all processors and to JECXZ on some processors. REP MOVSD and REP STOSD are quite fast if the repeat count is not too small. Always use the largest word size possible (DWORD in 32-bit mode, QWORD in 64-bit mode), and make sure that both source and destination are aligned by the word size. In many cases, however, it is faster to use XMM registers. Moving data in XMM registers is faster than REP MOVSD and REP STOSD in most cases, especially on older processors. See page 164 for details. Note that while the REP MOVS instruction writes a word to the destination, it reads the next word from the source in the same clock cycle. You can have a cache bank conflict if bit 2-4 are the same in these two addresses on P2 and P3. In other words, you will get a penalty of one clock extra per iteration if ESI+WORDSIZE-EDI is divisible by 32. The easiest way to avoid cache bank conflicts is to align both source and destination by 8. Never use MOVSB or MOVSW in optimized code, not even in 16-bit mode. On many processors, REP MOVS and REP STOS can perform fast by moving 16 bytes or an entire cache line at a time. This happens only when certain conditions are met. 
Depending on the processor, the conditions for fast string instructions are, typically, that the count must be high, both source and destination must be aligned, the direction must be forward, the distance between source and destination must be at least the cache line size, and the memory type for both source and destination must be either write-back or write-combining (you can normally assume the latter condition is met). Under these conditions, the speed is as high as you can obtain with vector register moves or even faster on some processors. While the string instructions can be quite convenient, it must be emphasized that other solutions are faster in many cases. If the above conditions for fast move are not met then there is a lot to gain by using other methods. See page 164 for alternatives to REP MOVS
Re: DIP 1015--removal of implicit conversion from integer and character literals to bool--Community Review Round 1
On Wednesday, 20 June 2018 at 08:16:21 UTC, Mike Parker wrote: This is the feedback thread for the first round of Community Review for DIP 1015, "Deprecation and removal of implicit conversion from integer and character literals to bool": https://github.com/dlang/DIPs/blob/7c2c39243d0d747191f05fb08f87e1ebcb575d84/DIPs/DIP1015.md All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on July 4, or when I make a post declaring it complete. At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round. Otherwise, it will be queued for the Final Review and Formal Assessment by the language maintainers. Please familiarize yourself with the documentation for the Community Review before participating. https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#community-review Just a little remark on the Rationale section. C has had a boolean type since C99: the built-in type _Bool, plus bool, true and false, which <stdbool.h> provides as macros. So it is wrong to say C doesn't have them. It doesn't change anything in the rest of the paper, but it is better not to propagate falsehoods.
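A short C99 sketch of what that looks like in practice. The function names normalized and is_even are made up for the illustration; bool, true and false come from <stdbool.h>, where C99 defines them as macros over the built-in _Bool type.

```c
#include <stdbool.h>

/* Converting any scalar to _Bool (spelled bool via <stdbool.h>) yields
 * 0 for a zero value and 1 for everything else. */
static int normalized(int n)
{
    bool b = n;
    return b;
}

/* Made-up example of a function with a genuine boolean return type. */
static bool is_even(int n)
{
    return n % 2 == 0;
}
```

Unlike an int used as a flag, a bool can only ever hold 0 or 1, which is why normalized(42) gives 1 and not 42.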
Re: Sign the installers
On Thursday, 28 June 2018 at 01:34:22 UTC, Jonathan M Davis wrote: On Wednesday, June 27, 2018 17:59:42 Brad Roberts via Digitalmars-d wrote: On 6/27/2018 5:34 PM, Jonathan M Davis via Digitalmars-d wrote: On Wednesday, June 27, 2018 17:26:36 Manu via Digitalmars-d wrote: I guess people feel nervous about installing allegedly potentially dangerous software on their corporate workstation. Honestly, that's exactly the sort of thing that I always ignore. I'd pay attention if anti-virus software outright said that it found a virus, but "unrecognized software?" That's exactly the sort of thing that's just going to get me pissed off at Microsoft for getting in my way. Though honestly, Microsoft pops up so many useless messages that it becomes easy to miss any that actually matter, because you have to skip through so many of them all the time that you stop paying attention to them. So, I'm definitely surprised to hear about programmers refusing to install something just because Microsoft doesn't recognize it. - Jonathan M Davis It's all about removing resistance and raising the level of professionalism. D isn't a hobby project and shouldn't act like one. This is an obvious barrier that's worth removing. In this day and age of rampant actively dangerous software, it's an obvious improvement to sign it and make the strong claim that this is produced and vended by the D Foundation and we vouch for its contents. We already do for some (all?) of the posix distribution bundles. Well, as I said in my initial response, I have no problem with the installer being signed. I'm just surprised that any programmers would care. The issue in a professional setting is not necessarily about the programmer himself but about the policies of his company or of the IT team in charge of the devs' PCs. As stated elsewhere, I work in a public administration and the IT is handled by a different directorate from the one I work for. 
The IT department is in charge of more than 15,000 PCs. You can imagine that they do everything to keep control over that fleet by normalising and tightening policies. They acknowledge that the developers need a little bit more leverage and freedom on their machines by providing some local admin rights, but even with that, it is sometimes quite difficult to install anything that is not on the official approved list. Unfortunately, D has been quite annoying to install. The last version, i.e. 2.080, for instance didn't install because one of the binaries gets quarantined by the anti-virus, an anti-virus I cannot influence because local admin rights are not sufficient to whitelist a file. Installing 64 bit code is also a chore, as dmd delegates the installation of the required libs to the Microsoft installer. The problem: the Microsoft installer is incapable of getting through our proxy and there has been no offline installation option since 2017. I know it's a Microsoft issue, but it is part of the things that make using D quite challenging. I'm highly motivated and am not pressed by deadlines so it doesn't bother me too much, but I can imagine that somewhat reluctant devs will stop at the first hurdle encountered.
Re: Sutter's ISO C++ Trip Report - The best compliment is when someone else steals your ideas....
On Friday, 13 July 2018 at 20:12:36 UTC, Steven Schveighoffer wrote: On 7/13/18 3:53 PM, Paolo Invernizzi wrote: On Friday, 13 July 2018 at 13:15:39 UTC, Steven Schveighoffer wrote: On 7/13/18 8:55 AM, Adam D. Ruppe wrote: [...] But it doesn't scale if you use OS processes, it's too heavyweight. Of course, it depends on the application. If you only need 100 concurrent connections, processes might be OK. Come on, Steve... 100 concurrent connections? Huh? What'd I say? Orders of magnitude too small. 100 concurrent connections you can handle with a sleeping arduino... :-)
Re: Truncate is missing from std.stdio.File, will this do the trick?
On Tuesday, 24 July 2018 at 00:15:37 UTC, spikespaz wrote: I needed a truncate function on the `std.stdio.File` object, so I made this function. Does it look okay? Are there any cross-platform improvements you can think of that should be added?

    import std.stdio: File;

    void truncate(File file, long offset) {
        version (Windows) {
            import core.sys.windows.windows: SetEndOfFile;
            file.seek(offset);
            SetEndOfFile(file.windowsHandle());
        }
        version (Posix) {
            import core.sys.posix.unistd: ftruncate;
            ftruncate(file.fileno(), offset);
        }
    }

Error handling is completely missing. It should throw a FileException or something when encountering an error, and there can be a lot of errors. Here is the list of errno errors that ftruncate() can fail with: EFBIG, EINTR, EINVAL, EIO, EISDIR, EPERM, EROFS, ETXTBSY and EBADF. For Windows it will probably be quite similar.
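As a rough idea of what the missing error handling amounts to on the POSIX side, here is a C sketch that reports the errno of the failing call and retries on EINTR. It is POSIX only and assumes a stream opened for writing; truncate_stream is a hypothetical helper name, and a D version would presumably turn the returned error code into an exception instead.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and ftruncate() */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (POSIX only): truncates the file behind fp to
 * length bytes. Returns 0 on success, otherwise the errno value set by
 * the call that failed (EINVAL, EIO, EROFS, EBADF, ...). */
static int truncate_stream(FILE *fp, off_t length)
{
    /* Push buffered data to the file first, so that it is not written
     * past the new end afterwards. */
    if (fflush(fp) != 0)
        return errno;

    /* EINTR only means a signal interrupted the call, so try again. */
    while (ftruncate(fileno(fp), length) != 0) {
        if (errno != EINTR)
            return errno;
    }
    return 0;
}
```

The caller can pass the result to strerror() for a message, or map the individual codes to more specific errors.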
Re: Comparing D vs C++ (wierd behaviour of C++)
On Tuesday, 24 July 2018 at 14:08:26 UTC, Daniel Kozak wrote: I am not C++ expert so this seems weird to me:

    #include <iostream>
    #include <string>
    using namespace std;

    int main(int argc, char **argv) {
        char c = 0xFF;
        std::string sData = {c,c,c,c};
        unsigned int i = (((((sData[0]&0xFF)*256
                     + (sData[1]&0xFF))*256)
                     + (sData[2]&0xFF))*256
                     + (sData[3]&0xFF));
        if (i != 0xFFFFFFFF) { // it is true why?
            // this print 18446744073709551615 wow
            std::cout << "WTF: " << i << std::endl;
        }
        return 0;
    }

compiled with: g++ -O2 -Wall -o "test" "test.cxx" when compiled with -O0 it works as expected Vs. D:

    import std.stdio;

    void main(string[] args) {
        char c = 0xFF;
        string sData = [c,c,c,c];
        uint i = (((((sData[0]&0xFF)*256
                 + (sData[1]&0xFF))*256)
                 + (sData[2]&0xFF))*256
                 + (sData[3]&0xFF));
        if (i != 0xFFFFFFFF) { // is false - make sense
            writefln("WTF: %d", i);
        }
    }

Int promotion rule. char is signed. The 256 are signed. When the result goes above INT_MAX it overflows (i.e. we're in UB territory) and the result can be anything. The registers of the CPU are 64 bit wide, so it sign-extends the calculation, and as the optimization removes the truncating memory write and reload, the value of the complete register is then printed by the cout <<. Conclusion: typical C(++) undefined behavior due to signed value overflow. Fix: 256u, and always compile with -ftrapv. In your case it would have caught the overflow. In D, signed overflow is not UB so everything works as planned. compiled with: dmd -release -inline -boundscheck=off -w -of"test" "test.d" So it is code gen bug on c++ side, or there is something wrong with that code.
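A sketch of the unsigned variant in C. Note that it swaps the multiplications by 256u for shifts on uint32_t, which is a different spelling of the same idea: every intermediate value is unsigned, so nothing can overflow a signed int. The helper name be32_from_bytes is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: combines four bytes, most significant first,
 * into one 32-bit value. Each byte goes through unsigned char (to avoid
 * sign extension of a negative char) and then uint32_t (so that the
 * shift happens in unsigned arithmetic, where wrap-around is defined
 * and signed-overflow UB cannot occur). */
static uint32_t be32_from_bytes(const char *bytes)
{
    return ((uint32_t)(unsigned char)bytes[0] << 24) |
           ((uint32_t)(unsigned char)bytes[1] << 16) |
           ((uint32_t)(unsigned char)bytes[2] << 8)  |
            (uint32_t)(unsigned char)bytes[3];
}
```

With four 0xFF bytes this yields 0xFFFFFFFF at every optimization level, because the result no longer depends on what the compiler does with an overflowed signed register.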
Re: Comparing D vs C++ (wierd behaviour of C++)
On Tuesday, 24 July 2018 at 14:41:17 UTC, Ecstatic Coder wrote: On Tuesday, 24 July 2018 at 14:08:26 UTC, Daniel Kozak wrote: I am not C++ expert so this seems weird to me:

    #include <iostream>
    #include <string>
    using namespace std;

    int main(int argc, char **argv) {
        char c = 0xFF;
        std::string sData = {c,c,c,c};
        unsigned int i = (((((sData[0]&0xFF)*256
                     + (sData[1]&0xFF))*256)
                     + (sData[2]&0xFF))*256
                     + (sData[3]&0xFF));
        if (i != 0xFFFFFFFF) { // it is true why?
            // this print 18446744073709551615 wow
            std::cout << "WTF: " << i << std::endl;
        }
        return 0;
    }

compiled with: g++ -O2 -Wall -o "test" "test.cxx" when compiled with -O0 it works as expected Vs. D:

    import std.stdio;

    void main(string[] args) {
        char c = 0xFF;
        string sData = [c,c,c,c];
        uint i = (((((sData[0]&0xFF)*256
                 + (sData[1]&0xFF))*256)
                 + (sData[2]&0xFF))*256
                 + (sData[3]&0xFF));
        if (i != 0xFFFFFFFF) { // is false - make sense
            writefln("WTF: %d", i);
        }
    }

compiled with: dmd -release -inline -boundscheck=off -w -of"test" "test.d" So it is code gen bug on c++ side, or there is something wrong with that code. As the C++ char are signed by default, when you accumulate several shifted 8 bit -1 into a char result and then store it in a 64 bit unsigned buffer, you get -1 in 64 bits : 18446744073709551615. That's not exactly what happens here. There's no 64 bit buffer. It's signed overflow, which is undefined behavior in C and C++. He gets different results with and without optimization because without optimization the result of the calculation is spilled to the unsigned int i and then reloaded for the print call. This save and reload truncates the value to its real value. In the optimized version, the compiler removed the spill and the overflowed value contained in the register is printed as is.
Re: Comparing D vs C++ (wierd behaviour of C++)
On Tuesday, 24 July 2018 at 19:24:05 UTC, Ecstatic Coder wrote: On Tuesday, 24 July 2018 at 15:08:35 UTC, Patrick Schluter wrote: On Tuesday, 24 July 2018 at 14:41:17 UTC, Ecstatic Coder wrote: On Tuesday, 24 July 2018 at 14:08:26 UTC, Daniel Kozak wrote: [...] As the C++ char are signed by default, when you accumulate several shifted 8 bit -1 into a char result and then store it in a 64 bit unsigned buffer, you get -1 in 64 bits : 18446744073709551615. That's not exactly what happens here. There's no 64 bit buffer. Sure about that ? ;) Yes, there are no "buffers", only registers and a place on the stack for the variable i. As said, it's undefined behaviour, so anything goes. I just checked on godbolt what code is generated. https://godbolt.org/g/wxqfmM So with -O0 this happens: lines 41 to 77 contain the instructions that make the calculation. At line 78, mov DWORD PTR [rbp-40], eax writes out 32 bits to the space reserved for i. At line 85, mov eax, DWORD PTR [rbp-40] reloads that value into eax; this zeroes the high part of RAX => RAX contains 0x___ On the -O2 version it's even simpler. The calculation is done at compile time and the end result -1 is put directly to the output. The test is even removed. Everything happens in the compiler.
Re: Comparing D vs C++ (wierd behaviour of C++)
On Tuesday, 24 July 2018 at 19:39:10 UTC, Ecstatic Coder wrote: He gets different results with and without optimization because without optimization the result of the calculation is spilled to the i unsigned int and then reloaded for the print call. This save and reload truncated the value to its real value. In the optimized version, the compiler removed the spill and the overflowed value contained in the register is printed as is. Btw you are actually confirming what I said. if (i != 0xFFFFFFFF) ... In the optimized version, when the 64 bits "i" value is compared to a 32 bits constant, the test fails... Proof that the value is stored in a **64** bits register, not 32... We're nitpicking over vocabulary. For me buffer != register. In my mental model a buffer is something in memory (or a piece of hardware like the store buffer between the registers and the cache), but I would never call a register a buffer.
Re: Comparing D vs C++ (wierd behaviour of C++)
On Tuesday, 24 July 2018 at 20:59:22 UTC, Patrick Schluter wrote: On Tuesday, 24 July 2018 at 19:24:05 UTC, Ecstatic Coder wrote: On Tuesday, 24 July 2018 at 15:08:35 UTC, Patrick Schluter wrote: On Tuesday, 24 July 2018 at 14:41:17 UTC, Ecstatic Coder wrote: On Tuesday, 24 July 2018 at 14:08:26 UTC, Daniel Kozak wrote: [...] As the C++ char are signed by default, when you accumulate several shifted 8 bit -1 into a char result and then store it in a 64 bit unsigned buffer, you get -1 in 64 bits : 18446744073709551615. That's not exactly what happens here. There's no 64 bit buffer. Sure about that ? ;) Yes, there are no "buffers", only registers and a place on the stack for the variable i. As said, it's undefined behaviour, so anything goes. I just checked on godbolt what code is generated. https://godbolt.org/g/wxqfmM So with -O0 this happens: lines 41 to 77 contain the instructions that make the calculation. At line 78, mov DWORD PTR [rbp-40], eax writes out 32 bits to the space reserved for i. At line 85, mov eax, DWORD PTR [rbp-40] reloads that value into eax; this zeroes the high part of RAX => RAX contains 0x___ What I forgot to mention: in the unoptimized build, the type deduction for cout's << operator is done with the i variable, so it chooses the right overload for unsigned int. In the optimized code, as the calculation is done during compilation and there is no spill to the variable, the type deduction for cout's << operator is done with that internal promoted temporary value, and it deduces it as long (funnily, declaring i as volatile doesn't change that, even if the value is spilled to the stack). On the -O2 version it's even simpler. The calculation is done at compile time and the end result -1 is put directly to the output. The test is even removed. Everything happens in the compiler.
Re: Engine of forum
On Tuesday, 21 August 2018 at 06:53:18 UTC, Daniel N wrote: On Tuesday, 21 August 2018 at 03:42:21 UTC, Ali wrote: Many of those newcomers who ask about the forum software .. they never stick, they don't complain, or question, or try to change for the better, they simply leave I think this is the best forum I have ever used, it's a big contributing factor to why I post here! I don't post every month praising the forum, I'm silently happy. But if we changed I would likely complain every month. Second that. The 2 big things this forum frontend has are that it forces you to snip quotes (go look on realworldtech to see whole threads of quote galore of 400 lines where the answer is just one word) and speed. The fact that comments cannot be edited is also an advantage. It forces you to put a little bit more thought into them.
Re: Is @safe still a work-in-progress?
On Wednesday, 22 August 2018 at 04:49:15 UTC, Mike Franklin wrote: On Wednesday, 22 August 2018 at 04:23:52 UTC, Jonathan M Davis wrote: The reality of the matter is that the DIP system is a formal way to propose language changes in order to convince Walter and Andrei that those changes should be implemented, whereas if Walter or Andrei writes the DIP, they're already convinced. This isn't a democracy. Walter is the BDFL, and it's his call. So, I really don't think that it's hypocritical Walter and Andrei need to have their ideas vetted by the community, not in an effort to convince anyone, but for quality assurance, to ensure they're not overlooking something. It is hypocritical and arrogant to believe that only our ideas have flaws and require scrutiny. The formal DIP process was put in place after DIP1000. I would even dare say that the process was put in place because of the issue with DIP1000 (the rigorously checked DIPs are all >1000 for that reason).
Re: This thread on Hacker News terrifies me
On Sunday, 2 September 2018 at 04:21:44 UTC, Jonathan M Davis wrote: On Saturday, September 1, 2018 9:18:17 PM MDT Nick Sabalausky (Abscissa) via Digitalmars-d wrote: So honestly, I don't find it at all surprising when an application can't handle not being able to write to disk. Ideally, it _would_ handle it (even if it's simply by shutting down, because it can't handle not having enough disk space), but for most applications, it really is thought of like running out of memory. So, it isn't tested for, and no attempt is made to make it sane.

One reason why programs using stdio fail badly on disk space errors is that their authors don't know that fclose() can be the function reporting the error, not fwrite()/fputs()/fprintf(). I cannot count the number of times I saw things like this:

    FILE *fd = fopen(..., "w");
    if (fwrite(buffer, length, 1, fd) < 1) {
        /* fine error handling */
    }
    fclose(fd);

When the disk is full, fwrite() might have accepted the data, but only fclose() really flushes the data to disk, so the lack of space is only detected at that moment.

Honestly, for some of this stuff, I think that the only way that it's ever going to work sanely is if extreme failure conditions result in Errors or Exceptions being thrown, and the program being killed. Most code simply isn't ever going to be written to handle such situations, and for a _lot_ of programs, they really can't continue without those resources - which is presumably why the way D's GC works is to throw an OutOfMemoryError when it can't allocate anything. Anything C-based (and plenty of C++-based programs too) is going to have serious problems though thanks to the fact that C/C++ programs often use APIs where you have to check a return code, and if it's a function that never fails under normal conditions, most programs aren't going to check it. Even diligent programmers are bound to miss some of them.

Indeed, since some of those error checks also differ from OS to OS, some cases might detect things in one setting but not in others. See my example above: on DOS, or if the stream was made unbuffered with setvbuf(), it could not happen, as fwrite() would always flush the data to disk and the error condition would be caught nearly 99.% of the time.
Re: This is why I don't use D.
On Wednesday, 5 September 2018 at 15:34:14 UTC, Jonathan M Davis wrote: On Wednesday, September 5, 2018 9:28:38 AM MDT H. S. Teoh via Digitalmars-d wrote: On Wed, Sep 05, 2018 at 09:18:24AM -0600, Jonathan M Davis via Digitalmars-d wrote: [...] 3rd party libraries are usually the real problem if there is one. They need to be maintained, and if something happens that breaks them from one release to another, that can prevent you from upgrading until it's fixed - which may or may not be quick even if they're maintained. And if they're not maintained, well, then that's a serious problem. Now, that would be a big problem in pretty much any language, but the greater rate of change in D does make it worse than it would be in languages that change at a much more glacial pace. [...] And that is why I think we should implement my idea of putting *all* dub packages on code.dlang.org into our CI infrastructure, and log all successes / failures to a database that can then be used to display the range of version(s) of compilers that successfully built each package on code.dlang.org. Then people can quickly see, at a glance, whether the package still works with the version of the compiler that they're using (usually the most recent release, but not always, so this information can be super useful in making decisions). Oh, I think that that's a good idea, and it should help with folks picking libraries to use, but it doesn't fix the fundamental problem that whoever wrote the library needs to continue to maintain it or pass it on to someone else to maintain it when they don't want to maintain it anymore, or anyone using it is going to be screwed. So, while your suggestion will definitely help with a piece of the problem, it doesn't solve the part that I was talking about. I have a more radical proposition. Why not regularly query the maintainers of each library for feedback? If there is no response or a negative response, simply remove the package from the main list. Put it in a section for unmaintained projects. It would not fundamentally change the state of the packages, but it would at least rein in expectations that are too high. OP would probably not have reacted as badly as he did if he had known that the package he tried was unmaintained. This might reduce the number of packages available, but it is often much better to have fewer choices than to choose the wrong one (or two or three).
Re: DIP25/DIP1000: My thoughts round 2
On Wednesday, 5 September 2018 at 01:06:47 UTC, Paul Backus wrote: On Tuesday, 4 September 2018 at 16:36:20 UTC, Nick Treleaven wrote: My syntax for parameters that may get aliased to another parameter is to write the parameter number that may escape it in its scope attribute: On Sunday, 2 September 2018 at 05:14:58 UTC, Chris M. wrote: void betty(ref scope int*'a r, scope int*'a p) // okay it's not pretty void betty(ref scope int* r, scope(1) int* p); p is documented as (possibly) escaped in parameter 1. Would using parameter names instead of numbers work? As an unfamiliar reader, it wouldn't be clear at all to me what `scope(1)` meant, but `scope(r) int* p` would at least suggest that there's some connection between `p` and `r`. That is indeed better imho, as numbered parameters are a PITA. Any change is annoying and fragile. I cannot count how often in C I had issues with annotations like __attribute__((nonnull(5,9))) and __attribute__((format(printf, 3, 4))) when I had to change the parameters.
Re: This is why I don't use D.
On Thursday, 6 September 2018 at 12:33:21 UTC, Everlast wrote: On Wednesday, 5 September 2018 at 12:32:33 UTC, Andre Pany wrote: On Wednesday, 5 September 2018 at 06:47:00 UTC, Everlast wrote: [...] You showed us a painful issue in our ecosystem which we can work on, thank you. You do not need to work on this, but do you have a proposal for a solution? What would help you (ranking according to last update, ...)? Kind regards Andre The problem is that all projects should be maintained. The issue, besides the tooling which can only reduce the problem to manageable levels, is that projects go stale over time. This is obvious! You say though "But we can't maintain every package, it is too much work"... and that is the problem, not that it is too much work but there are too many packages. This is the result of allowing everyone to build their own kitchen sink instead of having some type of common base types. It's sort of like most things now... say cell phone batteries... everyone makes a different one to their liking and so it is a big mess to find replacements after a few years. See, suppose if there were only one package... and everyone maintained it. Then as people leave other people will come in on a continual basis and the package will always be maintained as long as people are using it. This is why D needs organization, which it has none. It needs structure so things work and last and it isn't a continual fight. It's like if someone doesn't take care of their car. Eventually it starts to break down and when they do shitty fixes it only buys them a little time before it breaks down again and again. The issue isn't the fixes nor the car but how they use the car and not maintain it properly. That is, it is their mindsets. Since D seems to be full of people with very little understanding of how to build a proper foundation for organization, D has little chance of surviving. As the car breaks down more and more it is just a matter of time before it ends up in the junk heap. It was a great car while it lasted though... That's what I have said elsewhere in the thread. Check with the maintainer of a package; if there's no feedback, take the package out of the main list and put it in a purgatory where it can go stale on its own. If a new maintainer appears for a specific package, it can be reinstated in the approved list when it works again. What annoys people is not that there are broken packages in the list, but that there is no way to know beforehand whether one is choosing a reliable package or a hobby experiment gone wrong. This uncertainty is grating imo.
Re: Source changes should include date of change
On Saturday, 8 September 2018 at 12:36:01 UTC, Paul Backus wrote: On Saturday, 8 September 2018 at 11:29:15 UTC, Josphe Brigmo wrote: Um, I didn't say don't use Git! Your illogic is that you believe that one can have only one or the other when one can have both. Hence, you are excluding a completely valid addition. You think it is an alternative. You are wrong. Please think about the question before you answer next time so that you don't get in the habit of doing it. No one said that Git couldn't be used and telling me to use it is very arrogant of yourself. The fact of the matter is that dates in source code will help when git is not available and one only has the source code. Git does a better job of tracking history automatically than anyone could ever realistically do by hand. So not only would date comments be useless duplication of work, they'd be useless duplication of inferior quality to the original. It would be like keeping a horse at your house at all times, in case your car breaks down. Even if it's occasionally useful, it is not worth the constant maintenance costs of feeding the horse, cleaning the stable, etc. If your car breaks down, you find a way to get it fixed. If git isn't available to you, you find a way to make it available. Interactive programs like GitExtension show exactly the date of each line with the git blame view. Visual Studio Code with the D extension also shows the commit info when hovering over the code. There are a lot of nice ways to use the git info to get the date of a line. Dates in the comments are utterly useless. They are imo even counterproductive. The information has no bearing on the actual code. There is no point in putting dates in the comments when the code is managed by git.
Re: Mobile is the new PC and AArch64 is the new x64
On Saturday, 15 September 2018 at 15:25:55 UTC, Joakim wrote: You've probably heard of the possibly apocryphal story of how Blackberry and Nokia engineers disassembled the first iPhone and dismissed it because it only got a day of battery life, while their devices lasted much longer. They thought the mainstream market would care about such battery life as much as their early adopters, but they were wrong. If they'd asked me, they would have known. I was a very late adopter of mobile phones and got my first phone in 2000. It was my brother's used Siemens E10D. It had at most 1 day of battery life. Then at work I got an Alcatel and then a Nokia with ludicrously long battery life, nearly 2 weeks. Result -> my Siemens was always charged properly and I was always reachable. The others were always crapping out on me at the most inappropriate times. With the short battery life, you would never forget to put it on charge. With the long battery life, you would always wait till it was too late.
Re: phobo's std.file is completely broke!
On Monday, 17 September 2018 at 12:37:13 UTC, Temtaime wrote: On Sunday, 16 September 2018 at 22:49:26 UTC, Vladimir Panteleev wrote: To elaborate: On Sunday, 16 September 2018 at 22:40:45 UTC, Vladimir Panteleev wrote: If *YOU* are OK with the consequences of complexity, implement this in YOUR code, but do not enforce it upon others. This is much better done in user code anyway, because you only need to expand / normalize the path and prepend the prefix only once (root of the directory tree you're operating on), instead of once per directory call. We could add a `string longPath(string)` function in Phobos (no-op on POSIX, expands and prepends prefix on Windows). I believe I suggested the same in the bug report years ago when we discussed it. It is absolutely not acceptable behavior. Complain to Microsoft. The OS should not allow users to create or select paths that programs cannot operate on without jumping through crazy hoops. Microsoft could have solved this easily enough: extern(System) void AllowLongPaths(); Programs (or programming language runtimes) which can handle paths longer than MAX_PATH could call that function. It can also be used as a hint to the OS that file/directory selection dialogs, as you mentioned, are allowed to select paths longer than MAX_PATH. It's a problem with Phobos. It should be able to handle all paths, whatever length they have, on all platforms, without bothering the user. Even with a performance penalty, but it should. No, that's completely nuts! A library, especially a standard library, should not introduce new limitations, but papering over the limitations of the platform is not the right thing to do. If the platform's API is a POS, there's nothing a sane library can do about it. If your app writes to a FAT12-formatted floppy disk, you don't expect the library to implement code to alleviate its limitations, like 8+3 filenames or the fixed number of files in the root directory.
Re: Walter's Guide to Translating Code From One Language to Another
On Friday, 21 September 2018 at 06:24:14 UTC, Peter Alexander wrote: On Friday, 21 September 2018 at 06:00:33 UTC, Walter Bright wrote: I've learned this the hard way, and I've had to learn it several times because I am a slow learner. I've posted this before, and repeat it because it bears repeating. I find this is a great procedure for any sort of large refactoring -- minimal changes at each step and ensure tests are passing after every change. Also, use a git commit for each logical change. If you discover a change that should have been put in a previous commit, use rebase --interactive to put it in the right commit (of course, the branch you're working on is purely local). Only when all the changes have been made do you decide whether to squash them, reorder them, or push them partially. TL;DR git can help to organize the refactoring, not just be a recording device.
Re: Updating D beyond Unicode 2.0
On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote: 2. There are no rules about what *encoding* is acceptable, it's implementation defined. So various compilers have different rules as to what will be accepted in the actual source code. In fact, I read somewhere that not even ASCII is guaranteed to be supported. Indeed. IBM mainframes have C compilers too, but not ASCII. They code in EBCDIC. That's why, for instance, it's not portable to do things like if(c >= 'A' && c <= 'Z') printf("CAPITAL LETTER\n"); the assumption that the capital letters form one contiguous range is not true in EBCDIC.
Re: OT: Bad translations
On Wednesday, 26 September 2018 at 02:12:07 UTC, Ali Çehreli wrote: On 09/24/2018 08:17 AM, 0xEAB wrote: > - Non-idiomatic translations of tech terms [2] This is something I had heard from a Digital Research programmer in the early 90s: the English message was something like "No memory left" and the German translation was "No memory on the left hand side" :) The K&R in German was of the same "quality". That is what happens when the translator is not an IT person himself.