Re: Want a function that determines a double or float given its 80-bit IEEE 754 SANE (big endian) representation
On Wednesday, 23 August 2023 at 03:24:49 UTC, z wrote:
> On Tuesday, 22 August 2023 at 22:38:23 UTC, dan wrote:
>> Hi,
>>
>> I'm parsing some files, each containing (among other things) 10 bytes said to represent an IEEE 754 extended floating point number, in SANE (Standard Apple Numerical Environment) form, as SANE existed in the early 1990s (so, big endian).
>>
>> Note that the number actually stored will probably be a positive even integer less than 100,000, so a better format would have been to store a two-byte ushort rather than a 10-byte float. However, the spec chose to have an encoded float there.
>>
>> I would like to have a function of the form
>>
>>     public bool ubytes_to_double( ubyte[10] u, out double d ) { /* stuff */ }
>>
>> which would set d to the encoded value, provided that the value is a number and is sane, and otherwise just return false.
>>
>> So my plan is just to do this: examine the first 2 bytes to check the sign and see how big the number is, and if it is reasonable, convert the remaining 8 bytes to a fractional part, perhaps ignoring the last 2 or 3 as not being significant.
>>
>> But it seems like this kind of task may be something that D already does, maybe with some constructor of a double or something.
>>
>> Thanks in advance for any suggestions.
>>
>> dan
>
> On 32-bit x86 an endianness swap and pointer cast to `real` should be enough. (It seems to be the same format, but I could be wrong.)
>
> Otherwise (afaik `real` on 64-bit x86 is just `double`?) you can always isolate sign, mantissa and exponent into three separate `double` values (cast from integer to `double`) and recalculate the floating point number (`sign*mantissa*(2^^exponent)` according to Wikipedia). Since they mostly contain integers, precision loss probably won't be a problem.

Thank you z.

My machine is 64-bit and little-endian, but the method you suggest actually gives the right answer in my case.

More exactly, a real on my machine is 16 bytes (128 bits), quadruple precision, and it has a sign bit with 15 bits of exponent. But the 80-bit format also has a sign bit with 15 bits of exponent.

So all I have to do is declare a real y, cast its address to ubyte*, and copy the 10 ubytes from the file over its first 10 bytes (but backwards). Then the sign bit and exponent exactly match in position. (The remaining 6 ubytes are left in their initial state because they're way out in the least significant part of the number.)

Now, for my final code, I'm not actually doing this, because the size of real may be different on another machine, or the exponents may get different sizes due to a different layout, or some other problem.

So I just do it by hand (although I'm ignoring the last 4 ubytes since, for my usage, it ultimately gets boiled down to a 32-bit integer anyway). And "by hand" is pretty close to what you also mention in your mantissa*2^^exponent expression.

Thanks again.
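
For readers who want to see the "by hand" approach spelled out, here is a minimal sketch. It is not dan's actual code, which is not shown in the thread. It assumes the usual 80-bit extended layout (1 sign bit, 15 exponent bits with bias 16383, then a 64-bit significand with an explicit integer bit) stored big endian, and it uses all 8 significand bytes rather than dropping the last 4. It does not depend on the size or layout of `real`, which on x86-64 is typically the 80-bit x87 format padded to 16 bytes.

```d
import std.math : ldexp;

/// Sketch: decode a big-endian 80-bit extended value into a double.
/// Returns false for infinity, NaN, or a magnitude too large for double.
bool ubytes_to_double(const ubyte[10] u, out double d)
{
    // Byte 0 holds the sign bit and the top 7 exponent bits; byte 1 holds the low 8.
    immutable bool negative = (u[0] & 0x80) != 0;
    immutable int exponent = ((u[0] & 0x7F) << 8) | u[1];

    // Bytes 2..9 are the 64-bit significand, most significant byte first.
    ulong mantissa = 0;
    foreach (b; u[2 .. 10])
        mantissa = (mantissa << 8) | b;

    // An all-ones exponent encodes infinity or NaN.
    if (exponent == 0x7FFF)
        return false;

    // value = mantissa * 2^(exponent - bias - 63); the 63 accounts for the
    // significand being an integer with its binary point after the top bit.
    // Extended-precision denormals simply underflow to zero here.
    immutable double magnitude = ldexp(cast(double) mantissa, exponent - 16383 - 63);
    if (magnitude == double.infinity)
        return false;

    d = negative ? -magnitude : magnitude;
    return true;
}

unittest
{
    double d;
    // 0x400E AC44 0000 0000 0000 encodes 44100, a positive even integer below 100,000.
    ubyte[10] sample = [0x40, 0x0E, 0xAC, 0x44, 0, 0, 0, 0, 0, 0];
    assert(ubytes_to_double(sample, d) && d == 44_100.0);
}
```

The unittest block can be run with `dmd -unittest -main -run` on the file. Converting the 64-bit significand to `double` rounds it to 53 bits, which is harmless for the small integers described above.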
Re: parallel threads stalls until all thread batches are finished.
On Wednesday, 23 August 2023 at 13:03:36 UTC, Joe wrote:
> I use
>
>     foreach(s; taskPool.parallel(files, numParallel)) { L(s); } // L(s) represents the work to be done.

If you make that L function return, for example, “ok” when a file has downloaded successfully, you can try to use TaskPool.amap.

The other option is probably to use std.concurrency.
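
A minimal sketch of the amap suggestion follows. `fetch` and `fetchAll` are hypothetical names standing in for L(s) and its caller, and the download itself is stubbed out. The sketch only shows the call shape and makes no claim about how amap schedules the work.

```d
import std.parallelism : taskPool;

/// Hypothetical stand-in for L(s): would download `url` and report a status.
string fetch(string url)
{
    // Real download logic would go here; this stub always reports success.
    return "ok";
}

/// Evaluates fetch over all files in parallel and returns one status per file,
/// in the same order as the input.
string[] fetchAll(string[] files)
{
    return taskPool.amap!fetch(files);
}

unittest
{
    auto statuses = fetchAll(["s1", "s2", "s3"]);
    assert(statuses == ["ok", "ok", "ok"]);
}
```

Because amap returns the results in an array, the caller can check afterwards which files failed without sharing any state between the parallel tasks.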
parallel threads stalls until all thread batches are finished.
I use

    foreach(s; taskPool.parallel(files, numParallel)) { L(s); } // L(s) represents the work to be done.

to download files from the internet.

Everything works. The issue is this: the foreach will download 8 files at once, BUT it will not start the next batch of 8 *until* ALL of the previous 8 are done. It seems that taskPool.parallel will not immediately start a new thread once a task is done.

E.g., I get

    L(s1); L(s2); ... L(s8);
    --- // nothing below is executed until all of L(s1) through L(s8) are finished.
    L(s9); L(s10); ...

My expectation is that when the first task completes, say L(s4), L(s9) is then executed.

The reason this causes me problems is that the downloaded files, which are cached to a temporary file, stick around and do not free up space (think of it as using memory), and this can cause problems some of the time.

Also, the point of parallel tasks is to allow work to run in parallel, but the way the code is working, it starts the tasks in parallel and then essentially stalls a large portion of the time. E.g., if there are a bunch of small downloads but one large one, then that one large download stalls everything.

E.g., say L(s5) is a very long download while all the others are very quick. Then L(s5) will prevent anything afterwards from downloading until it is finished (I'll get L(s1) through L(s8) but nothing else until L(s5) is finished).

What's going on, and how can I fix it?
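
One thing worth checking: in std.parallelism, the second argument to parallel is documented as the work unit size (the number of elements handed to a worker as a single task), not the number of threads. Whether that explains the batching described above is an assumption, not a confirmed diagnosis. The sketch below uses a dedicated TaskPool to set the worker count and a work unit size of 1, so each worker claims one element at a time. `runPerItem` and its parameters are hypothetical names.

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : TaskPool;

/// Runs `work` on every element, handing out one element per task.
void runPerItem(string[] files, size_t numWorkers, void delegate(string) work)
{
    // A dedicated pool makes the worker count explicit instead of relying
    // on the default taskPool.
    auto pool = new TaskPool(numWorkers);
    scope (exit) pool.finish(true); // wait for the worker threads to terminate

    // Work unit size 1: a worker that finishes one element claims the next one.
    foreach (s; pool.parallel(files, 1))
        work(s);
}

unittest
{
    shared size_t done;
    runPerItem(["s1", "s2", "s3", "s4", "s5"], 2,
        (string s) { atomicOp!"+="(done, 1); });
    assert(atomicLoad(done) == 5);
}
```

Note that the thread running the foreach also executes work units while it waits, so the effective concurrency can be one more than numWorkers.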
Re: File size
On Tuesday, 22 August 2023 at 16:22:52 UTC, harakim wrote:
> On Monday, 21 August 2023 at 11:05:36 UTC, FeepingCreature wrote:
>> Can you print some of the wrong sizes? D's DirEntry iteration code just calls `FindFirstFileW`/`FindNextFileW`, so this *shouldn't* be a D-specific issue, and it should be possible to reproduce this in C.
>
> Thanks for the suggestion. I was working on getting the list for you when I decided to first try to reproduce this on Linux. I was not able to do so. Then I opened the Linux file explorer and went to one of the files. There were two files by that name, with names differing only by case. In Windows, I only saw one, because Windows Explorer only supports one file with a given case-insensitive name per directory. Unsurprisingly, that is also the one that was selected by getSize(filename). The underlying Windows functions must ignore case as well and select the same way as Explorer (which makes sense).
>
> That explains why Windows Explorer reported the same size as getSize(name) in every case, while DirEntry.size would match for the file with the same case as Windows recognized and not for the file with a different case.
>
> I was able to get into this state because I copied the files (merged directories) in Linux.
>
> It was interesting to look into. It seems everything is working as designed. It shouldn't be an issue for me going forward either, as I move more and more towards Linux.

That's hilarious! I'm happy you found it.
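
A situation like the one harakim describes could be detected with a small helper. The following is only a sketch: `caseCollisions` is a hypothetical name, and it assumes the directory listing itself returns both entries, as the DirEntry iteration did in the thread.

```d
import std.file : dirEntries, SpanMode;
import std.path : baseName;
import std.uni : toLower;

/// Returns groups of paths in `dir` whose file names differ only by case.
string[][] caseCollisions(string dir)
{
    // Group every entry under the lower-cased form of its file name.
    string[][string] byLower;
    foreach (entry; dirEntries(dir, SpanMode.shallow))
        byLower[entry.name.baseName.toLower] ~= entry.name;

    // Keep only the groups with more than one member.
    string[][] result;
    foreach (names; byLower)
        if (names.length > 1)
            result ~= names;
    return result;
}
```

Each returned group lists the full paths that would appear as a single name to a case-insensitive lookup.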