VBScript helper
MS Windows ships with Windows Script Host (WSH), so that every single malware developer can do anything once he has finally got you to double-click a small text file. Now WSH works for D developers too, providing easy (but not very fast) access to system information (and, yes, COM objects). See the example in the docs. For now it only reads objects from VBScript. Writing data back and calling functions on VBScript objects can be added if someone really needs it (just contact me). Sources: https://bitbucket.org/denis_sh/misc/src/tip/vbscripthelper.d Docs: http://deoma-cmd.ru/d/docs/misc/vbscripthelper.html -- Денис В. Шеломовский Denis V. Shelomovskij
Re: unzip parallel, 3x faster than 7zip
On Sat, 07 Apr 2012 21:45:04 +0200, Jay Norwood j...@prismnet.com wrote: So ... it looks like the defrag helps, as the 109 sec values are at the low end of the range I've seen previously. Still it is totally surprising to me that deleting files should take longer than creating the same files. Maybe the kernel caches writes, but synchronizes deletes? (So the seek times become apparent there, and not in the writes.) Also check the file creation flags; maybe you can hint the final file size to Windows so that the files won't be fragmented?
Re: unzip parallel, 3x faster than 7zip
On Sunday, 8 April 2012 at 13:55:21 UTC, Marco Leise wrote: Maybe the kernel caches writes, but synchronizes deletes? (So the seek times become apparent there, and not in the writes) Also check the file creation flags, maybe you can hint Windows to the final file size and they wont be fragmented?

My understanding is that a delete operation occurs after all the file handles associated with a file are closed, assuming the other handles were opened with FILE_SHARE_DELETE. I believe otherwise you get an error from the attempt to delete.

I'm doing some experiments with myDefrag sortByName() and it indicates to me that there will be huge improvements in delete efficiency available on a hard drive if you can figure out some way to get the OS to arrange the files and directories in LCNs in that by-name order. Below are the delete times from win7 rmdir on the same 2GB folder with and without defrag using myDefrag sortByName().

This is win7 rmdir following the myDefrag sortByName() defrag ... less than 7 seconds:

    G:\>cmd /v:on /c "echo !TIME! & rmdir /q /s tz & echo !TIME!"
    9:06:33.79
    9:06:40.47

This is the same rmdir without defrag of the folder: 2 minutes 14 secs.

    G:\>cmd /v:on /c "echo !TIME! & rmdir /q /s tz & echo !TIME!"
    14:34:09.06
    14:36:23.36

This is all on win7 NTFS, and I have no idea if similar gains are available for Linux. So, yes, whatever tricks you can play with the win api in order to get it to organize the unzipped archive into this particular order are going to make huge improvements in the speed of delete.
Re: DMagick image processing with D.
Thanks for DMagick... works great, except toBlob(). When I try the following: Image example = new Image(Geometry(100, 100), new ColorRGB(0, 255, 0)); example.toBlob(); I get an access violation. Am I doing anything wrong, or is it a bug? (I'm using Imagick 6.7.6, DMD64 and CentOS)
Re: I'll be in Seattle at Lang.NEXT
Andrei Alexandrescu: Slides are online: http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Three-Unlikely-Successful-Features-of-D Putting the slides online before the talk is a very good idea, thank you. Page 31: the title of this slide is "D array = pointer + length", but the image shows two pointers inside the array struct/fat pointer. Walter has said several times that he would like to replace the pointer + length with two pointers. Is that going to produce a change? And even if this is a bit OT: why aren't D array fat references composed of 3 fields: pointer + length + capacity? I think Go slices are like this. Page 34, Convenient: I don't know how well DMD will optimize this code, but it's one of the simplest to read array-twiddling palindrome functions I've seen. But I would probably write !a.empty instead of a.length. Page 36, Palindrome generalized: unfortunately D doesn't map syntaxes like a[1..$-1] to range functions :-) Page 51: a horizontal line needs to be at 1.0 too. But I prefer a graph that shows run-time seconds. Very nice slide pack. Bye, bearophile
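For readers without the slides at hand, here is a minimal sketch of the two styles being compared on pages 34 and 36. It is not the code from the slides; the names isPalindrome and isPalindromeRange are made up for illustration. The first version shrinks an array slice with a[1 .. $ - 1], the second does the same walk with range primitives only.

```d
import std.range; // empty, front, back, popFront, popBack for arrays

// Array-only version: shrink the slice from both ends.
bool isPalindrome(T)(const(T)[] a)
{
    for (; a.length > 1; a = a[1 .. $ - 1])
        if (a[0] != a[$ - 1])
            return false;
    return true;
}

// Range version: the same walk expressed with range primitives,
// since slicing syntax is not mapped onto them.
bool isPalindromeRange(R)(R r)
    if (isBidirectionalRange!R)
{
    while (!r.empty)
    {
        if (r.front != r.back)
            return false;
        r.popFront();
        if (r.empty)
            break;
        r.popBack();
    }
    return true;
}

unittest
{
    assert(isPalindrome([1, 2, 3, 2, 1]));
    assert(!isPalindrome([1, 2, 3]));
    assert(isPalindromeRange([1, 2, 2, 1]));
    assert(!isPalindromeRange([1, 2, 3]));
}
```

The range version needs the extra emptiness check between popFront and popBack because an odd-length input leaves a single element that must not be popped twice.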
Re: I'll be in Seattle at Lang.NEXT
On 4/8/12 11:31 AM, bearophile wrote: Andrei Alexandrescu: Slides are online: http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Three-Unlikely-Successful-Features-of-D Putting the slides online before the talk is a very good idea, thank you. Page 31: the title of this slide is D array = pointer + length, but the image shows two pointers inside the array struct/fat pointer. I mention during the talk that the concept is the same regardless of that representation detail. Andrei
code to get LCN from filename
I hacked up one of the file.d functions to create a function that returns the first Logical Cluster Number for a regular file. I've tested it on the 2GB layout that has been defragged with the myDefrag sortByName() operation, and it works as expected. Values of 0 mean the file was small enough to fit in the MFT. The LCN numbers would be a good thing to sort by before doing accesses on entries coming from any large directory operations ... for example zip, copy, delete of directories.

    enum { FILE_DEVICE_FILE_SYSTEM = 9, METHOD_NEITHER = 3, FILE_ANY_ACCESS = 0 }

    uint CTL_CODE(uint t, uint f, uint m, uint a)
    {
        return (t << 16) | (a << 14) | (f << 2) | m;
    }

    const FSCTL_GET_RETRIEVAL_POINTERS = CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 28, METHOD_NEITHER, FILE_ANY_ACCESS);

    /*
    extern (Windows) int DeviceIoControl(void *, uint, void *, uint, void *, uint, uint *, _OVERLAPPED *);

    from WinIoCtl.h in SDK
    #define FSCTL_GET_RETRIEVAL_POINTERS CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 28, METHOD_NEITHER, FILE_ANY_ACCESS) // STARTING_VCN_INPUT_BUFFER, RETRIEVAL_POINTERS_BUFFER
    */

    struct RETRIEVAL_POINTERS_BUFFER{ }

    ulong getStartLCN (in char[] name)
    {
        int[] buffer = [ 0 ];
        version(Windows)
        {
            alias TypeTuple!(GENERIC_READ, FILE_SHARE_READ, null, OPEN_EXISTING,
                             FILE_ATTRIBUTE_NORMAL, HANDLE.init) defaults;
            auto h = useWfuncs
                ? CreateFileW(std.utf.toUTF16z(name), defaults)
                : CreateFileA(toMBSz(name), defaults);
            cenforce(h != INVALID_HANDLE_VALUE, name);
            scope(exit) cenforce(CloseHandle(h), name);

            alias long LARGE_INTEGER ;

            struct STARTING_VCN_INPUT_BUFFER
            {
                LARGE_INTEGER StartingVcn;
            }
            STARTING_VCN_INPUT_BUFFER inputVcn;
            inputVcn.StartingVcn = 0;

            struct RPExtents
            {
                LARGE_INTEGER NextVcn;
                LARGE_INTEGER Lcn;
            }
            struct RETRIEVAL_POINTERS_BUFFER
            {
                DWORD ExtentCount;
                LARGE_INTEGER StartingVcn;
                RPExtents rpExtents[1];
            }
            RETRIEVAL_POINTERS_BUFFER rpBuf;
            DWORD numBytes;

            //expect only a partial return of one rpExtent
            DeviceIoControl( h, FSCTL_GET_RETRIEVAL_POINTERS,
                             cast(void*)&inputVcn, inputVcn.sizeof,
                             cast(void*)&rpBuf, rpBuf.sizeof,
                             &numBytes, null );
            return cast(ulong)rpBuf.rpExtents[0].Lcn;
        }
        else version(Posix)
            return 0; // not implemented
    }
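The suggestion to sort by LCN before touching the files could look roughly like the sketch below. It is only an illustration under assumptions: Entry and orderByLcn are hypothetical names, and the LCN lookup is passed in as a delegate so that the example also runs on systems where getStartLCN is not implemented.

```d
import std.algorithm : sort;

// A file name paired with its first Logical Cluster Number.
struct Entry
{
    string name;
    ulong lcn;
}

// Look up each name's LCN once, then order the batch by LCN so that
// later accesses walk the disk in roughly ascending cluster order.
Entry[] orderByLcn(string[] names, ulong delegate(string) lookupLcn)
{
    Entry[] entries;
    foreach (n; names)
        entries ~= Entry(n, lookupLcn(n));
    sort!((a, b) => a.lcn < b.lcn)(entries);
    return entries;
}

unittest
{
    // Made-up cluster numbers stand in for real getStartLCN results.
    ulong[string] fake = ["a.txt": 900, "b.txt": 0, "c.txt": 120];
    auto ordered = orderByLcn(["a.txt", "b.txt", "c.txt"],
                              (string n) => fake[n]);
    assert(ordered[0].name == "b.txt");
    assert(ordered[1].name == "c.txt");
    assert(ordered[2].name == "a.txt");
}
```

On Windows the delegate would presumably wrap getStartLCN; files that live in the MFT (LCN 0) simply sort to the front.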
Re: Modern COM Programming in D
On Tuesday, 3 April 2012 at 14:10:32 UTC, Jesse Phillips wrote: On Tuesday, 3 April 2012 at 07:49:28 UTC, Sam Hu wrote: Sorry, the link http://dpxml-lio/d is unreachable from my side (maybe someone blocked it :P). Could you please provide an alternative place for download? Appreciated. Regards, Sam Most of his code isn't available as it was kind of under Microsoft. However I revived Juno for D2 a while ago (still need to play with it myself). Juno provides some nice tools and API. https://github.com/JesseKPhillips/Juno-Windows-Class-Library http://dsource.org/projects/juno Thanks for the reply. Yes, Juno is great, I have used it since D1. Regards, Sam
Re: custom attribute proposal (yeah, another one)
After reading the thread my vote goes to the struct proposal. The two approaches, functions vs. structs, are functionally equivalent, but conceptually structs are preferable. Attributes are meta _data_, which is conceptually associated with types, whereas functions are conceptually associated with behaviour. Simply put, structs are the more intuitive choice. There are valid concerns raised about the implementation: code bloat, struct default ctor, etc. Those are implementation concerns that should be handled in the compiler and not in the language design.
Re: Small Buffer Optimization for string and friends
Andrei Alexandrescu seewebsiteforem...@erdani.org wrote in message news:jlr9ak$28bv$1...@digitalmars.com... Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. It turns out statistically a lot of strings are small. According to a variety of systems we use at Facebook, the small buffer optimization is king - it just works great in all cases. In D that means better speed, better locality, and less garbage.

This sounds like it would be a great addition to phobos.

For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to.

- This has been a disaster for AAs
- Is it worth doing for 32 bit?
- Would generate false pointers
- Run-time check on every array access?
- Why should this be in the language/compiler instead of phobos?

April 1st was last week!!
Re: Precise GC
On Apr 7, 2012, at 6:56 PM, Walter Bright newshou...@digitalmars.com wrote: Of course, many of us have been thinking about this for a looong time, and what is the best way to go about it. The usual technique is for the compiler to emit some sort of table for each TypeInfo giving the layout of the object, i.e. where the pointers are. The general problem with these is the table is non-trivial, as it will require things like iterated data blocks, etc. It has to be compressed to save space, and the gc then has to execute a fair amount of code to decode it. It also requires some significant work on the compiler end, leading of course to complexity, rigidity, development bottlenecks, and the usual bugs. An alternative Andrei and I have been talking about is to put in the TypeInfo a pointer to a function. That function will contain customized code to mark the pointers in an instance of that type. That custom code will be generated by a template defined by the library. All the compiler has to do is stupidly instantiate the template for the type, and insert an address to the generated function. The compiler need know NOTHING about how the marking works. Even better, as ctRegex has demonstrated, the custom generated code can be very, very fast compared with a runtime table-driven approach. (The slow part will be calling the function indirectly.) And best of all, the design is pushed out of the compiler into the library, so various schemes can be tried out without needing compiler work. I think this is an exciting idea, it will enable us to get a precise gc by enabling people to work on it in parallel rather than serially waiting for me. With __traits and such, I kind of always figured we'd go this way. There's simply no reason to have the compiler generate a map. Glad to see it's working out.
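To make the proposal concrete, here is a minimal sketch of what a library-generated marking function could look like. Everything in it is an assumption for illustration: the MarkFn signature, the markPointers template and the Node type are invented here, and a real collector would also have to handle arrays, classes, unions and nested aggregates.

```d
import std.traits : isPointer;

// Signature every generated marker shares; `mark` is the collector's callback.
alias MarkFn = void function(void* instance, void delegate(void*) mark);

// Library template: walks the fields of T and reports each non-null pointer.
void markPointers(T)(void* instance, void delegate(void*) mark)
{
    auto obj = cast(T*) instance;
    foreach (field; (*obj).tupleof)
    {
        static if (isPointer!(typeof(field)))
        {
            if (field !is null)
                mark(cast(void*) field);
        }
    }
}

struct Node
{
    int value;
    Node* next;
    int* data;
}

unittest
{
    // What the compiler would store in TypeInfo under the proposal:
    // just the address of the instantiated template.
    MarkFn marker = &markPointers!Node;

    int payload = 7;
    Node tail;
    Node head = Node(1, &tail, &payload);

    void*[] seen;
    marker(&head, (void* p) { seen ~= p; });
    assert(seen == [cast(void*) &tail, cast(void*) &payload]);
}
```

The loop over tupleof is unrolled at compile time, so the generated function contains only the pointer checks for that one type, which is the source of the speed-up claimed over decoding a table at run time.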
Re: std.benchmark ready for review. Manager sought after
On Sunday, 8 April 2012 at 05:41:11 UTC, Andrei Alexandrescu wrote: 3) "benchmark_relative_file read" should be replaced with a language construct. E.g. a function call like relativeBenchmark("file read"), or an enum value like getopt's. No can do. Need a function name-based convention so we can automate scheduleForBenchmarking. Hmm, maybe there's a misunderstanding, but what I meant was: The benchmark_relative_ prefix makes sense for function names (for scheduleForBenchmarking), but not so much for string literals for benchmark names. The string literal "benchmark_relative_file read" looks like the words "benchmark relative file" are grouped together, with "read" added on. So, my suggestion would be to wrap the benchmark_relative_ prefix - when used with benchmark name strings - into a semantic function / enum / etc. In my example above, relativeBenchmark would be: string relativeBenchmark(string s) { return "benchmark_relative_" ~ s; } I suppose it can be summed up as a tradeoff between complexity (you need to explain both the function name usage and the relativeBenchmark wrapper usage) vs. code prettiness.
Re: Custom attributes (again)
I don't want this thread to disappear. The ideas presented here have common basic features among the nice-to-haves.

1. Attributes add meta data baggage to a symbol
2. This meta data is thought of as a read-only hash (has, get, iterate)
3. Can be queried at compile-time
4. The syntax is concise (i.e. improves over implementing attributes 'manually' with mixins)

Now the compiler has solved things (what is a symbol, an AST, ...) in one specific way, and to keep it stable and the amount of work within bounds, any implementation details of attributes that make invasive changes necessary should be postponed. I don't know the compiler, so it is just my gut feeling when I say that annotating local variables could be a refinement to 1) that doesn't work well. You get the point. With that in mind it would be good to know exactly what can simply be tacked onto the front-end (rather than refactoring a complex part of it).

"What do you want to hear?" I hear you ask. Ok, here is my shameless "since D is supposed to be ..." argument: D is supposed to be pragmatic, so I would start with a collection of use cases. Really, "I want attributes to be like this" is not enough. The use case should be stated in a fashion that programmers without experience in the field can follow. The standard use cases for attributes are must-haves, the rest can be nice-to-haves. Add to the list what you need from the attribute system:

** Serialization/RPC **

A relational SQL database comes with meta information on data types. We want to annotate D types with the corresponding types in the DB. This can be used to validate the value ranges at runtime, generate the correct SQL to work with the tables or even create tables that don't exist already. It is also common to establish relations between tables. The struct Parent may have a 'Child*[]' field and Child a 'Parent*' field. The same applies to RPC.
For example we may want to return a bool[] using a special case in the RPC system for bit arrays: '@Rpc(type = RpcType.native_bit_array, mode = RpcMode.async) bool[] foo() { … }'. This annotation applies to the symbol foo. If we make this a more complex return type, it becomes this:

    enum RpcMode …;
    enum RpcType …;
    struct Rpc …;
    struct RpcStruct …;

    // has a single field, syntax options:
    // @RpcTypeMap(type = RpcType.int)
    // @RpcTypeMap(RpcType.int)
    // @RpcTypeMap(int) ?
    struct RpcTypeMap { RpcType type; }

    @RpcStruct struct MyRet
    {
        string name;
        @RpcTypeMap(RpcType.native_bit_array) bool[] flags;
    }

    @Rpc(mode = RpcMode.async) MyRet foo() { … }

For RPC it is also necessary to annotate parameters in the same way:

    void foo(@RpcTypeMap(RpcType.native_bit_array) bool[] flags) { … }

** Edit object properties in a GUI (as suggested by Manu) **

As seen in GUI builders, the need often occurs to generate bindings to data structures in order to edit their properties in a convenient user interface. A uint field may be edited using a RGB+Alpha color selector and serialized into a data file. The requirements for the annotations are the same as above. (I know that RTTI would make life easier here, but it isn't a show stopper.) For these to work it would require:

- user annotations to functions/methods/structs/classes
- only CTFE support (as annotations don't change at runtime)
- no influence on language semantics

And I agree with others that it is a good idea to implement annotations as structured types (POD structs at least) to avoid spelling mistakes and encourage IDE support. Just as an idea: such structs could contain their own logic. So some annotations which work stand-alone could validate themselves (invariant()?), print debug msgs, write binding definitions to text files (if CTFE I/O happens) or actually mixin code if used on structs/classes. But that's just brain-storming to give an idea why annotations as key/value pairs could be inflexible.
Generally C# and Java annotations have both been a success, so they are both what people mostly expect and a good 'template' for D, aside from run-time vs. compile-time issues. Thanks for reading :) -- Marco
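As a hedged illustration of the struct-based approach: later D releases gained user-defined attributes that work much like the RPC use case sketched in this post, although with positional struct literals rather than the named-argument syntax proposed here. The sketch below uses std.traits.hasUDA and std.traits.getUDAs from current Phobos; the RPC types themselves are hypothetical and only mirror the names used above.

```d
import std.traits : getUDAs, hasUDA;

enum RpcType { plain, native_bit_array }

// Annotation types are ordinary structs.
struct RpcStruct {}
struct RpcTypeMap { RpcType type; }

@RpcStruct struct MyRet
{
    string name;
    @RpcTypeMap(RpcType.native_bit_array) bool[] flags;
}

// All queries below are answered at compile time.
static assert(hasUDA!(MyRet, RpcStruct));
static assert(hasUDA!(MyRet.flags, RpcTypeMap));
static assert(!hasUDA!(MyRet.name, RpcTypeMap));
static assert(getUDAs!(MyRet.flags, RpcTypeMap)[0].type
              == RpcType.native_bit_array);
```

Because the attribute is a typed struct value, a misspelled field or enum member is a compile error, which is the "avoid spelling mistakes" benefit argued for above.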
Re: a pretty exciting result for parallel D lang rmd following defrag by name
On Sunday, 8 April 2012 at 01:18:49 UTC, Jay Norwood wrote: in it. Same 3.7 second delete. I'll have to analyze what is happening, but this is a huge improvement. If it is just the sequential LCN order of the operations, it may be that I can just pre-sort the delete operations by the file lcn number and get similar results. I ran rmd in the debugger to look at the order of entries being returned from the depth first search. The directory entry list returned is sorted alphabetically the same whether or not the sortByName() defrag script has been executed. This article confirms that directory entries are sorted alphabetically. http://msdn.microsoft.com/en-us/library/ms995846.aspx Directory entries are sorted alphabetically, which explains why NTFS files are always printed alphabetically in directory listings. I'll have to write something to dump the starting lcn for each directory entry and see if the sortByName defrag is matching the DirEntries list exactly.
Re: std.benchmark ready for review. Manager sought after
On 08/04/2012 05:25, Andrei Alexandrescu wrote: Hello, I finally found the time to complete std.benchmark. I got to a very simple API design, starting where I like it: one line of code. Code is in the form of a pull request at https://github.com/D-Programming-Language/phobos/pull/529. (There's some noise in there caused by my git n00biness). Documentation is at http://erdani.com/d/web/phobos-prerelease/std_benchmark.html. If reasonable and at all possible, I'd like to bump the priority of this proposal. Clearly D's user base is highly interested in efficiency, and many of the upcoming libraries have efficiency a virtual part of their design. So we should get std.benchmark in soon and require that new addition come with benchmarks for their essential functionality. My vision is that in the future Phobos will have a significant benchmarks battery, which will help improving Phobos and porting to new platforms. Andrei Like it. Would it be a good idea to add a column with the average memory used?
Re: Precise GC
On 04/08/2012 03:56 AM, Walter Bright wrote: Of course, many of us have been thinking about this for a looong time, and what is the best way to go about it. The usual technique is for the compiler to emit some sort of table for each TypeInfo giving the layout of the object, i.e. where the pointers are. The general problem with these is the table is non-trivial, as it will require things like iterated data blocks, etc. It has to be compressed to save space, and the gc then has to execute a fair amount of code to decode it. It also requires some significant work on the compiler end, leading of course to complexity, rigidity, development bottlenecks, and the usual bugs. An alternative Andrei and I have been talking about is to put in the TypeInfo a pointer to a function. That function will contain customized code to mark the pointers in an instance of that type. That custom code will be generated by a template defined by the library. All the compiler has to do is stupidly instantiate the template for the type, and insert an address to the generated function. The compiler need know NOTHING about how the marking works. Even better, as ctRegex has demonstrated, the custom generated code can be very, very fast compared with a runtime table-driven approach. (The slow part will be calling the function indirectly.) And best of all, the design is pushed out of the compiler into the library, so various schemes can be tried out without needing compiler work. I think this is an exciting idea, it will enable us to get a precise gc by enabling people to work on it in parallel rather than serially waiting for me. That actually sounds like a pretty awesome idea.
Shared library in D on Linux
I am still testing which setup gives me reliable shared D libraries which can be used from C. Here is my latest test:

* test.d:

    import std.stdio;

    extern (C) {
        void hiD() {
            writeln("hi from D lib");
        }
    }

* main.c

    #include <stdio.h>
    #include <dlfcn.h>
    #include <stdlib.h>

    void main() {
        void (*hiD)(void);
        void* handle = dlopen("./libtest.so", RTLD_LAZY);
        if (handle == NULL) {
            printf("%s\n", dlerror());
            exit(1);
        }
        hiD = dlsym(handle, "hiD");
        if (hiD != NULL) {
            hiD();
        } else {
            printf("hiD is null\n");
        }
        dlclose(handle);
    }

* Makefile

    #!/bin/bash
    test:
        #gdc-4.6 -g -c test.d -fPIC -o test.o
        #gdc-4.6 -shared -o libtest.so -fPIC test.o -lc -nostartfiles
        dmd -g -c test.d -fPIC
        ld -shared -o libtest.so test.o -lrt -lphobos2 -lpthread
        gcc -g main.c -ldl -lpthread
        ./a.out
    clean:
        rm -rf *.so *.o *.out

With this setup I get

    ./libtest.so: undefined symbol: _deh_beg

With a fake main method added I get

    make: *** [test] Segmentation fault

This is what I get from gdb

    Program received signal SIGSEGV, Segmentation fault.
    0xb7fd1ed3 in std.stdio.__T7writelnTAyaZ.writeln() (_param_0=...) at /usr/include/d/dmd/phobos/std/stdio.d:1550
    1550 enforce(fprintf(.stdout.p.handle, "%.*s\n",

Even with the following startup hooks

    extern(C) {
        void gc_init();
        void gc_term();
        void _init() { gc_init(); }
        void _fini() { gc_term(); }
    }

I get the same error. I am using dmd 2.058 on Ubuntu 11.10 (32 bit). With gdc I get different errors, but it seems even more difficult to get it working. Does anyone know what is missing to get proper shared library support working on Linux?
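One thing the startup hooks above do not cover is the rest of the D runtime: gc_init only sets up the collector, while module constructors (which std.stdio may rely on) are run by the full runtime initialisation. A hedged sketch of exporting that to the C host is shown below. Runtime.initialize and Runtime.terminate are real functions in core.runtime; the wrapper names hiD_init and hiD_term are made up, and whether this is enough with dmd 2.058 on Linux is not something this sketch can promise.

```d
import core.runtime : Runtime;

// Exported with C linkage so a C host can call them right after dlopen
// and right before dlclose. The names are illustrative only.
extern (C) int hiD_init()
{
    // Runs the full runtime start-up: GC, module constructors, etc.
    return Runtime.initialize() ? 1 : 0;
}

extern (C) int hiD_term()
{
    // Runs module destructors and shuts the runtime down again.
    return Runtime.terminate() ? 1 : 0;
}
```

On the C side the host would look both symbols up with dlsym, call hiD_init before hiD and hiD_term before dlclose.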
Re: Precise GC
On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. Make sure that the compiler does not actually rely on the fact that the template generates a function. The design should include the possibility of just generating tables. It all should be completely transparent to the compiler, if that is possible.
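The table alternative asked for here can come out of the same library-side template machinery. The sketch below is an assumption-laden illustration, not a proposed design: pointerOffsets and Node are invented names, and only plain pointer fields are considered. It builds the list of pointer offsets for a type during compilation, so a collector could walk plain data instead of calling a function per type.

```d
import std.traits : Fields, isPointer;

// Build, at compile time, the byte offsets of all pointer fields of T.
size_t[] pointerOffsets(T)()
{
    size_t[] offsets;
    foreach (i, F; Fields!T)
    {
        static if (isPointer!F)
            offsets ~= T.tupleof[i].offsetof;
    }
    return offsets;
}

struct Node
{
    int value;
    Node* next;
    int* data;
}

// Forced through CTFE, so the table ends up as plain static data.
static immutable size_t[] nodeTable = pointerOffsets!Node();

unittest
{
    assert(nodeTable == [Node.next.offsetof, Node.data.offsetof]);
}
```

Since both the function and the table are produced by a library template, the compiler only needs to store whatever the instantiation yields, which keeps it agnostic about the scheme.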
Re: D projects list
On Fri, 6 Apr 2012 14:52:38 -0400, Nick Sabalausky a@a.a wrote: H. S. Teoh hst...@quickfur.ath.cx wrote in message news:mailman.1417.1333721195.4860.digitalmar...@puremagic.com... On Fri, Apr 06, 2012 at 12:34:09PM +0400, Denis Shelomovskij wrote: And adobe Flash of course should also die. +1. It should have died a DECADE ago. Except that certain interests kept its decaying worm-infested corpse animating even till today. Funny, that's also how I feel about C++. As I've been saying for awhile, a decade of near-zero interest in anything but VM languages is what kept it on life support. Fortunately, D's quickly becoming the successor that's always been needed so C++ will finally be able to RIP. Hehe, that might work for your own projects, but realistically look at how many more people are knowledgeable about C++, companies have big projects written in C++, desktop environments are written in C++, many tools (internal, commercial, free) exist for C++, it is a stable target and so on. D is still adding features that make it interesting for certain audiences: SSE intrinsics for the Manus of the world (game devs), annotations for GUI/ORM/RPC bindings, short lambda syntax. Not every feature of D is subjectively better than what is available in C++, but I assume most C++ devs miss _something_ that D offers or will offer (in case of the shared implementation). -- Marco
Re: Issue with module destructor order
Created bug ticket: http://d.puremagic.com/issues/show_bug.cgi?id=7855 -- Kind Regards Benjamin Thaut
readonly storage class
While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable; immutable: can be implicitly shared among threads. The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class, readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome. -- Kind Regards Benjamin Thaut
Re: Discussion on Go and D
On 4/7/12, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: 3. The ability to dispose of memory will disappear along with the delete keyword. Pull this and hopefully that myth will come to an end: https://github.com/D-Programming-Language/d-programming-language.org/pull/112
Re: Precise GC
On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. I understand that the stack will still have to be scanned conservatively, but how does the scheme deal with closures?
Re: a pretty exciting result for parallel D lang rmd following defrag by name
On 08/04/2012 09:34, Jay Norwood wrote: On Sunday, 8 April 2012 at 01:18:49 UTC, Jay Norwood wrote: in it. Same 3.7 second delete. I'll have to analyze what is happening, but this is a huge improvement. If it is just the sequential LCN order of the operations, it may be that I can just pre-sort the delete operations by the file lcn number and get similar results. I ran rmd in the debugger to look at the order of entries being returned from the depth first search. The directory entry list returned is sorted alphabetically the same whether or not the sortByName() defrag script has been executed. This article confirms that directory entries are sorted alphabetically. http://msdn.microsoft.com/en-us/library/ms995846.aspx Directory entries are sorted alphabetically, which explains why NTFS files are always printed alphabetically in directory listings. I'll have to write something to dump the starting lcn for each directory entry and see if the sortByName defrag is matching the DirEntries list exactly. Hi, You seem to have done a pretty good job with your parallel unzip. Have you tried a parallel zip as well? Do you think you could include this in std.zip when you're done?
Re: readonly storage class
On Sunday, April 08, 2012 11:16:40 Benjamin Thaut wrote: While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable; immutable: can be implicitly shared among threads. The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class, readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome. I would point out that casting const to mutable and then altering the variable is subverting the type system. The compiler does not support casting away either const or immutable to alter _anything_. So, as far as the type system is concerned, if you want a function that takes both const and immutable, it should take const. Now, you _can_ cast away const and alter a variable if you're careful, but you're subverting the type system when you do so and throwing away any guarantees that the compiler gives you. It's far from safe. Given that casting away const on a variable and then mutating it is subverting the type system, and that therefore the compiler is free to assume that you will never do it, I don't see what your idea of readonly would buy us. It's the same as const. - Jonathan M Davis
Re: readonly storage class
On 04/08/2012 11:16 AM, Benjamin Thaut wrote: While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable

It cannot.

immutable: can be implicitly shared among threads. The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class, readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome.

I don't get the problem. Can you demonstrate the issue with an example?
Re: Small Buffer Optimization for string and friends
On 4/8/12, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: essentially making them templates I just hope that doesn't cause: 1) Awful template errors 2) Slower build times 3) More ICEs
Re: Precise GC
On 4/8/2012 2:21 AM, Timon Gehr wrote: On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. I understand that the stack will still have to be scanned conservatively, but how does the scheme deal with closures? For now, just treat them conservatively.
Re: readonly storage class
On 08.04.2012 11:28, Jonathan M Davis wrote: On Sunday, April 08, 2012 11:16:40 Benjamin Thaut wrote: While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable; immutable: can be implicitly shared among threads. The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class, readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome. I would point out that casting const to mutable and then altering the variable is subverting the type system. The compiler does not support casting away either const or immutable to alter _anything_. So, as far as the type system is concerned, if you want a function that takes both const and immutable, it should take const. Now, you _can_ cast away const and alter a variable if you're careful, but you're subverting the type system when you do so and throwing away any guarantees that the compiler gives you. It's far from safe. Given that casting away const on a variable and then mutating it is subverting the type system, and that therefore the compiler is free to assume that you will never do it, I don't see what your idea of readonly would buy us. It's the same as const. - Jonathan M Davis
I'll come up with an example. But if what you say is true, why is immutable not implicitly convertible to const? -- Kind Regards Benjamin Thaut
Re: Precise GC
On 8 April 2012 11:56, Timon Gehr timon.g...@gmx.ch wrote: On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. Make sure that the compiler does not actually rely on the fact that the template generates a function. The design should include the possibility of just generating tables. It all should be completely transparent to the compiler, if that is possible. This sounds important to me. If it is also possible to do the work with generated tables, and not calling thousands of indirect functions in someone's implementation, it would be nice to reserve that possibility. Indirect function calls in hot loops make me very nervous for non-x86 machines.
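The "just generating tables" option discussed above could look roughly like the following D sketch. It is only an illustration under assumptions: `pointerOffsets`, `Node` and `nodeTable` are hypothetical names, not part of druntime or of the proposal, and nested structs, unions, delegates and static arrays are deliberately ignored.

```d
// Hypothetical helper: computes, at compile time, the byte offsets of the
// words inside struct T that hold a pointer the collector must follow.
size_t[] pointerOffsets(T)() if (is(T == struct))
{
    size_t[] offsets;
    foreach (i, F; typeof(T.tupleof))
    {
        static if (is(F == U*, U) || is(F == class))
            offsets ~= T.tupleof[i].offsetof;                 // plain pointer or class reference
        else static if (is(F == E[], E))
            offsets ~= T.tupleof[i].offsetof + size_t.sizeof; // slice: length word first, then ptr
    }
    return offsets;
}

// Hypothetical example type.
struct Node
{
    int value;
    Node* next;
    int[] data;
}

// The table is plain data evaluated via CTFE; a marker could walk it
// without any indirect function call per object.
static immutable size_t[] nodeTable = pointerOffsets!Node();

unittest
{
    assert(nodeTable == [Node.next.offsetof, Node.data.offsetof + size_t.sizeof]);
}
```

A marker that reads such a table only needs a loop over offsets, which is the property that matters for the non-x86 concern raised in this message.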
Re: readonly storage class
On Sunday, April 08, 2012 11:39:03 Benjamin Thaut wrote: I'll come up with an example. But if what you say is true, why is immutable not implicitly convertible to const? It _is_ implicitly convertible to const. - Jonathan M Davis
Re: Small Buffer Optimization for string and friends
On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout.
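One reading of the two figures quoted above is that reserving a single value of the highest-order length byte as the discriminator leaves 255 of the 256 possible top-byte values for real lengths. The D sketch below checks that arithmetic; the names `reducedLimit` and `toExabytes` are made up for the illustration, and `ulong.max` (2^64 - 1) stands in for the full 2^64 range.

```d
// Largest length prefix still available if one top-byte value (0xFF here,
// an arbitrary choice for this sketch) is reserved: 255 * 2^56 bytes.
enum ulong reducedLimit = 0xFFUL << 56;

// Converts a byte count to decimal exabytes (10^18 bytes).
double toExabytes(ulong bytes)
{
    return bytes / 1e18;
}

unittest
{
    auto full = toExabytes(ulong.max);       // about 18.45
    auto reduced = toExabytes(reducedLimit); // about 18.37
    assert(full > 18.44 && full < 18.45);
    assert(reduced > 18.37 && reduced < 18.38);
}
```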
Re: readonly storage class
On Sun, 08 Apr 2012 11:39:03 +0200, Benjamin Thaut c...@benjamin-thaut.de wrote: On 08.04.2012 11:28, Jonathan M Davis wrote: On Sunday, April 08, 2012 11:16:40 Benjamin Thaut wrote: While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable immutable: can be implicitly shared among threads The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome. I would point out that casting const to mutable and then altering the variable is subverting the type system. The compiler does not support casting away either const or immutable to alter _anything_. So, as far as the type system is concerned, if you want a function that takes both const and immutable, it should take const. Now, you _can_ cast away const and alter a variable if you're careful, but you're subverting the type system when you do so and throwing away any guarantees that the compiler gives you. It's far from safe. 
Given that casting away const on a variable and then mutating is subverting the type system and that therefore the compiler is free to assume that you will never do it, I don't see what your idea of readonly would buy us. It's the same as const. - Jonathan M Davis I'll come up with an example. But if what you say is true, why is immutable not implicitly convertible to const? It is.
Re: Discussion on Go and D
On 4/8/2012 12:04 AM, Timon Gehr wrote: On 04/07/2012 04:43 PM, Rainer Schuetze wrote: On 4/7/2012 8:24 AM, Dmitry Olshansky wrote: On 07.04.2012 2:08, Rainer Schuetze wrote: On 4/6/2012 8:01 PM, Walter Bright wrote: On 4/6/2012 10:37 AM, Rainer Schuetze wrote: I hope there is something wrong with my reasoning, and that you could give me some hints to avoid the memory bloat and the application stalls. A couple of things you can try (they are workarounds, not solutions): 1. Actively delete memory you no longer need, rather than relying on the gc to catch it. Yes, this is as unsafe as using C's free(). Actually, having to deal with lifetime issues myself takes away the biggest plus of the GC, so I am a bit reluctant to do this. How about this: http://blog.thecybershadow.net/2010/07/15/data-d-unmanaged-memory-wrapper-for-d/ Or you can wrap-up something similar along the same lines. Thanks for your and other's hints on reducing garbage collected memory, but I find it hard to isolate larger blocks of memory for manual management. Most of the structs and classes are rather small. As you apparently just re-parse the whole source and throw the old AST away, wouldn't it be rather simple? You could just create a region allocator and free all the memory at once after the re-parse. If you only use the syntax error highlighting feature that might work. As soon as semantic analysis kicks in modifications on a copy of the parse tree are done lazily, and as symbols get resolved, references into other trees are remembered. It can get rather involved to figure out when it is safe to delete a memory block manually. The region allocator has the advantage that it does not need the alignment to the power of 2 for each allocation, so memory could be saved, too. I guess, I'll try something along the line with some additional functions to avoid bad references after deletion. 
I'm rather unhappy to sell D with the hint "Go back to manual memory management if you need more than 64MB of memory and want your application to be responsive". I think it is actually awesome that manual memory management is possible.
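The region allocator suggested in this exchange can be sketched in a few lines of D. This is a minimal illustration, not code from the thread: the `Region` type and its method names are hypothetical, it assumes power-of-two alignments, and the malloc'ed block is not scanned by the GC, so it must not hold the only reference to GC-managed objects.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical bump-pointer region: allocations are carved out of one block
// and released all at once, e.g. after a re-parse.
struct Region
{
    private ubyte* base;
    private size_t capacity;
    private size_t used;

    @disable this(this); // copying would lead to a double free

    this(size_t capacity)
    {
        base = cast(ubyte*) malloc(capacity);
        this.capacity = base ? capacity : 0;
    }

    ~this()
    {
        free(base);
    }

    // Returns `size` bytes aligned to `alignment` (assumed to be a power of
    // two), or null when the region is exhausted. Sizes are not rounded up.
    void[] allocate(size_t size, size_t alignment = size_t.sizeof)
    {
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start > capacity || size > capacity - start)
            return null;
        used = start + size;
        return base[start .. used];
    }

    // Frees every allocation made so far in one step.
    void reset()
    {
        used = 0;
    }
}

unittest
{
    auto region = Region(64);
    auto a = region.allocate(10);
    auto b = region.allocate(10);
    assert(a.length == 10 && b.length == 10);
    assert(region.allocate(100) is null); // does not fit
    region.reset();                       // everything released at once
    assert(region.allocate(64).length == 64);
}
```

The difficulty raised in the message remains: `reset` is only safe once nothing outside the region still refers into it.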
Re: Small Buffer Optimization for string and friends
On 8 April 2012 12:46, Vladimir Panteleev vladi...@thecybershadow.net wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32bit?
Re: Small Buffer Optimization for string and friends
On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. Erm... never mind, I thought the pointer was the first field. Even so, how would you use the lowest-order byte of the length as a discriminator? Unless you allocated bytes 2-8 for the length (which would be an unaligned read, or a shift every time...) making assumptions about the memory layout now seems like the only solution. Also, what will be .ptr for such arrays? It can't point inside the string, because the type is immutable and the array is on the stack. Duplication can be expensive as well.
Re: Precise GC
On 4/8/2012 11:21 AM, Timon Gehr wrote: On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. I understand that the stack will still have to be scanned conservatively, but how does the scheme deal with closures? I guess the compiler should generate an (anonymous) struct type corresponding to the closure data layout. There probably has to be a template for compiler generated structs or classes anyway. This new type could also be used as the type of the context pointer, so a debugger could display the closure variables.
Re: Custom attributes (again)
On 2012-04-08 09:27, Marco Leise wrote: I don't want this thread to disappear. The ideas presented here have common basic features among the nice-to-haves. For these to work it would require: - user annotations to functions/methods/structs/classes - only CTFE support (as annotations don't change at runtime) I don't see why the attributes should be accessible at runtime. Even if they're read-only it's still good to be able to read the attributes at runtime. -- /Jacob Carlborg
Re: std.benchmark ready for review. Manager sought after
Andrei Alexandrescu wrote: I finally found the time to complete std.benchmark. I got to a very simple API design, starting where I like it: one line of code. Code is in the form of a pull request at https://github.com/D-Programming-Language/phobos/pull/529. (There's some noise in there caused by my git n00biness). Documentation is at http://erdani.com/d/web/phobos-prerelease/std_benchmark.html. For algorithms that process sequences, it would be nice to have results represented in cycles per item. This should give more consistent results across different CPU families and different clock speeds. Specifically, I am thinking of cycles per byte; see http://en.wikipedia.org/wiki/Cycles_per_byte Example: http://www.cryptopp.com/benchmarks.html
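The conversion asked for here is simple arithmetic: elapsed time multiplied by the clock frequency gives cycles, and dividing by the number of bytes processed gives cycles per byte. The D sketch below is only an illustration; `cyclesPerByte` is a hypothetical helper, not part of the proposed std.benchmark API, and it assumes a fixed, known clock frequency.

```d
// Hypothetical helper: converts a wall-clock measurement into cycles per byte.
// clockHz is the assumed CPU frequency in hertz.
double cyclesPerByte(double seconds, double clockHz, ulong bytes)
{
    return seconds * clockHz / bytes;
}

unittest
{
    // 1 GB processed in one second on a 3 GHz core is 3 cycles per byte.
    assert(cyclesPerByte(1.0, 3e9, 1_000_000_000) == 3.0);
}
```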
Re: Shared library in D on Linux
On 2012-04-08 10:45, Timo Westkämper timo.westkam...@gmail.com wrote:

I am still testing which setup gives me reliable shared D libraries which can be used from C. Here is my latest test:

* test.d:

    import std.stdio;

    extern (C) {
        void hiD() {
            writeln("hi from D lib");
        }
    }

* main.c

    #include <stdio.h>
    #include <dlfcn.h>
    #include <stdlib.h>

    int main(void) {
        void (*hiD)(void);
        void* handle = dlopen("./libtest.so", RTLD_LAZY);
        if (handle == NULL) {
            printf("%s\n", dlerror());
            exit(1);
        }
        hiD = dlsym(handle, "hiD");
        if (hiD != NULL) {
            hiD();
        } else {
            printf("hiD is null\n");
        }
        dlclose(handle);
        return 0;
    }

* Makefile

    #!/bin/bash

    test:
        #gdc-4.6 -g -c test.d -fPIC -o test.o
        #gdc-4.6 -shared -o libtest.so -fPIC test.o -lc -nostartfiles
        dmd -g -c test.d -fPIC
        ld -shared -o libtest.so test.o -lrt -lphobos2 -lpthread
        gcc -g main.c -ldl -lpthread
        ./a.out

    clean:
        rm -rf *.so *.o *.out

With this setup I get

    ./libtest.so: undefined symbol: _deh_beg

With a fake main method added I get

    make: *** [test] Segmentation fault

This is what I get from gdb

    Program received signal SIGSEGV, Segmentation fault.
    0xb7fd1ed3 in std.stdio.__T7writelnTAyaZ.writeln() (_param_0=...) at /usr/include/d/dmd/phobos/std/stdio.d:1550
    1550 enforce(fprintf(.stdout.p.handle, "%.*s\n",

Even with the following startup hooks

    extern(C) {
        void gc_init();
        void gc_term();

        void _init() {
            gc_init();
        }

        void _fini() {
            gc_term();
        }
    }

I get the same error. I am using dmd 2.058 on Ubuntu 11.10 (32 bit). With gdc I get different errors, but it seems even more difficult to get it working. Does anyone know what is missing to get proper shared library support working on Linux?

This is what I can think of for now:

* Proper initialization of TLS data
* Setting up exception handling tables
* Setting up module info

-- /Jacob Carlborg
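The startup hooks in the quoted message initialize only the GC. A fuller runtime start-up can be requested through `core.runtime`, as in the D sketch below. The exported names `initD` and `termD` are hypothetical, and nothing in this thread establishes that doing so is sufficient for shared libraries with dmd 2.058 on Linux; the items listed above (TLS, exception tables, module info) may still be unresolved there.

```d
import core.runtime : Runtime;
import std.stdio : writeln;

extern (C)
{
    // Hypothetical entry point: the C host calls this once after dlopen().
    // Runtime.initialize() starts the D runtime (GC, module constructors).
    int initD()
    {
        return Runtime.initialize() ? 1 : 0;
    }

    void hiD()
    {
        writeln("hi from D lib");
    }

    // Hypothetical exit point: the C host calls this before dlclose().
    int termD()
    {
        return Runtime.terminate() ? 1 : 0;
    }
}
```

On the C side, `initD` and `termD` would be looked up with `dlsym` in the same way as `hiD` and called around it.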
Re: Small Buffer Optimization for string and friends
On 2012-04-08 07:56, Andrei Alexandrescu wrote: For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions: Just don't make the same mistake as with AA. -- /Jacob Carlborg
Re: custom attribute proposal (yeah, another one)
On Fri, 06 Apr 2012 16:53:56 +0200, Timon Gehr timon.g...@gmx.ch wrote: On 04/06/2012 04:23 PM, Steven Schveighoffer wrote: Why should we be restricted to only structs? Or any type for that matter? A restriction to only structs is not a restriction because structs can have arbitrary field types. +1 The benefit to using CTFE functions is that the compiler already knows how to deal with them at compile-time. i.e. less work to make the compiler implement this. It is exactly the same amount of work because CTFE is able to execute struct constructors. Yep.

Well, in the end the older idea of attributes == structs/classes is - as Adam D. Ruppe said - exchanging the constructor for a function call. You can do the same as with structs/classes and a little more (return basic types). IDEs can work with this proposal. And if we have to prepend @attribute to our CTFE-attribute-functions or attribute-structs that's fine with me, too. :) While I give Steven a virtual karma point for making it clear that not everything is an object (or struct) with his idea, I prefer structs-only over functions, because we could augment these structs with compiler recognized methods/fields at a later point. Think of the range interface here, which is implicit. Some examples:

    @attribute struct MyAttribute {
        // this attribute-struct is allowed multiple times on a symbol
        enum bool allowMultiple = true;

        // cannot be used on classes, but structs, methods and fields
        enum uint appliesTo = Fields | Methods | Structs;

        // Run any action in context of the filled attribute structure and
        // the type it is applied to. With CTFE I/O, it could generate
        // binaries or text files with bindings.
        void onInvokation(T)() { … }

        void invariant() {
            static assert(author, "Author must be given.")
        }

        string author;
        string email = "no email";
    }

    @MyAttribute(author = "Some Name", email = "some@email")
    @MyAttribute(author = "Someone Else", email = "else@email")
    struct Test { … }

All of this is nice-to-have at most. I imagine it becomes interesting when attributes are offered by libraries, and you want to give the user of your attributes some validation and convenience. E.g. the compiler can guarantee that only allowMultiple-attributes appear multiple times, so you can focus on implementing the logic instead of checking for usage errors. -- Marco
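For comparison, the structs-as-attributes idea can be tried with the user-defined attribute syntax that later D compilers accept (`@Struct(args)` plus `__traits(getAttributes, ...)`). The sketch below is not the syntax proposed in this message: `Author` and `authorsOf` are hypothetical names, and the `allowMultiple`/`appliesTo` validation described above is not modelled.

```d
// Hypothetical attribute type: a plain struct whose constructor arguments
// carry the annotation data.
struct Author
{
    string name;
    string email = "no email";
}

@Author("Some Name", "some@email")
@Author("Someone Else")
struct Test
{
}

// Collects the names of all Author attributes attached to T.
// Usable at compile time through CTFE and also at run time.
string[] authorsOf(T)()
{
    string[] names;
    foreach (attr; __traits(getAttributes, T))
    {
        static if (is(typeof(attr) == Author))
            names ~= attr.name;
    }
    return names;
}

// Evaluated by the compiler: the attributes are visible to CTFE.
static assert(authorsOf!Test() == ["Some Name", "Someone Else"]);
```

Because `authorsOf` is an ordinary function, the same data can also be read at run time, which is the point Jacob Carlborg raises in the "Custom attributes (again)" thread.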
Re: readonly storage class
Benjamin Thaut wrote: The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class readonly. Const is really a readonly view of data. I think that immutable should be named const, and const should be named readonly, so they won't cause confusion. If you need a function that don't change data just mark parameters as const. All mutable, const and immutable types are implicitly convertible to const. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Yes, this is how const (as readonly) works. If you think about readonly, use const. The only drawback is the names.
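The point made in this reply can be checked with a short D sketch: a single function taking a const parameter accepts mutable, const and immutable arguments alike. The function name `countSpaces` is made up for the illustration.

```d
// One function, written once: the const parameter promises not to modify
// the data and accepts any mutability of the argument.
size_t countSpaces(const(char)[] text)
{
    size_t n;
    foreach (c; text)
    {
        if (c == ' ')
            ++n;
    }
    return n;
}

unittest
{
    char[] mutableText = "a b".dup;          // mutable
    const(char)[] constView = mutableText;   // const view of mutable data
    string immutableText = "a b c";          // immutable(char)[]

    assert(countSpaces(mutableText) == 1);
    assert(countSpaces(constView) == 1);
    assert(countSpaces(immutableText) == 2); // implicit immutable -> const
}
```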
A modest proposal: eliminate template code bloat
I think it's been ages since I meant to ask why nobody (as in compiler vendors) does what I think is a rather simple optimization. In the short term the plan is to introduce a link-time flavored optimization at code generation or (better) at the link step. For simplicity let's assume the compiler does all of the following during generation of an object file.

1. Every time a function is generated (or pretty much any symbol), not only is a size calculated but also a checksum* of its data. (If we go for link-time optimization we should find a place to stick it in the object file.)
2. The compiler maintains a hash table mapping symbol_size to a list (~ array) of pairs (reference to data, checksum) of all symbols with the given size. Call it a duplicate table. Every function generated, and more generally global immutable data, should end up there.
3. After any function is generated, the compiler checks for an entry in the duplicate table that matches in size, then in checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.)

*Yes, checksum. I think it should be a really simple and easy to parallelize hash function. The checksum on its own is not reliable, and some amount of balancing and profiling is obviously required when picking this function.

Applicability: It's not only const/immutable bloat, which can be alleviated with inout. There are plenty of places where the exact same code is being generated: e.g. sealed containers of int vs uint, std.find on dchar[] vs uint[]/int[] and so on. In general, coarse-grained parametrization is the root of all evil and it is inevitable since we are just humans after all.

Notes:

1. If we do the checksum calculation on the fly during codegen it comes at virtually no cost, as the data is in the CPU data cache. A preliminary version can avoid hacking this part of the backend though.
2. By _alias_ I mean the ability of the compiler to emit references to a given symbol as if it was some other symbol (should be really straightforward).
3. The linker has more data and is able to achieve colossal size savings, essentially running through the same algorithm before actually linking things. Again it's an easily separable step and can be opt-in.
4. Ironically the exact same thing works with any kind of immutable data structures. It looks like string pooling is superseded by this proposal.

Thoughts? -- Dmitry Olshansky
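The three steps of the proposal can be sketched in D as a small in-memory model. This is an illustration under assumptions, not compiler code: `DuplicateTable`, `Entry`, `add` and `checksum` are hypothetical names, symbols are represented as strings and machine code as byte arrays, and FNV-1a stands in for whatever cheap hash would actually be chosen.

```d
// Stand-in for the "simple" checksum: 32-bit FNV-1a over the code bytes.
uint checksum(const(ubyte)[] data) pure nothrow @safe
{
    uint h = 2166136261u;
    foreach (b; data)
    {
        h ^= b;
        h *= 16777619u;
    }
    return h;
}

struct DuplicateTable
{
    private static struct Entry
    {
        string symbol;
        immutable(ubyte)[] code;
        uint sum;
    }

    private Entry[][size_t] bySize; // step 2: size -> list of (data, checksum)
    string[string] aliases;         // step 3: duplicate symbol -> kept symbol

    // Registers a generated symbol and returns the symbol that should be
    // emitted for it: itself, or an earlier symbol with identical bytes.
    string add(string symbol, immutable(ubyte)[] code)
    {
        immutable sum = checksum(code);              // step 1
        if (auto bucket = code.length in bySize)     // same size?
        {
            foreach (ref e; *bucket)
            {
                // Checksum first, then the full byte comparison (the memcmp).
                if (e.sum == sum && e.code == code)
                {
                    aliases[symbol] = e.symbol;
                    return e.symbol;
                }
            }
        }
        bySize[code.length] ~= Entry(symbol, code, sum);
        return symbol;
    }
}

unittest
{
    DuplicateTable table;
    immutable(ubyte)[] a = [0x55, 0x89, 0xE5, 0xC3];
    immutable(ubyte)[] b = [0x55, 0x89, 0xE5, 0xC3]; // same bytes as a
    immutable(ubyte)[] c = [0x55, 0x89, 0xE5, 0x90]; // same size, different bytes

    assert(table.add("find!int", a) == "find!int");
    assert(table.add("find!uint", b) == "find!int"); // folded into the first
    assert(table.add("other", c) == "other");
    assert(table.aliases["find!uint"] == "find!int");
}
```

The model leaves out anything position-dependent: real code with relocations would need those compared as well before two functions can be treated as identical.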
Re: Precise GC
On 08-04-2012 12:07, Rainer Schuetze wrote: On 4/8/2012 11:21 AM, Timon Gehr wrote: On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. I understand that the stack will still have to be scanned conservatively, but how does the scheme deal with closures? I guess the compiler should generate an (anonymous) struct type corresponding to the closure data layout. There probably has to be a template for compiler generated structs or classes anyway. This new type could also be used as the type of the context pointer, so a debugger could display the closure variables. This sounds sensible to me. No reason closure marking can't be precise if the compiler just emits the relevant type info (pretty much any other compiler with closures does this; see C#, F#, etc). -- - Alex
Re: readonly storage class
Piotr Szturmaj wrote: "If you need a function that don't change data just mark parameters as" Correction: "don't" should read "doesn't".
Re: Precise GC
On 08-04-2012 11:42, Manu wrote: On 8 April 2012 11:56, Timon Gehr timon.g...@gmx.ch wrote: On 04/08/2012 10:45 AM, Timon Gehr wrote: That actually sounds like a pretty awesome idea. Make sure that the compiler does not actually rely on the fact that the template generates a function. The design should include the possibility of just generating tables. It all should be completely transparent to the compiler, if that is possible. This sounds important to me. If it is also possible to do the work with generated tables, and not calling thousands of indirect functions in someone's implementation, it would be nice to reserve that possibility. Indirect function calls in hot loops make me very nervous for non-x86 machines. Yes, I agree here. The last thing we need is a huge amount of kinda-sorta-virtual function calls on ARM, MIPS, etc. It may work fine on x86, but anywhere else, it's really not what you want in a GC. -- - Alex
Re: Issue with module destructor order
On Mar 26, 2012 5:11 AM, Benjamin Thaut c...@benjamin-thaut.de wrote: Is this intended behaviour or is this a bug? I assume this happens because of the mixin template and the public import. I'm using dmd 2.058. -- Kind Regards Benjamin Thaut I don't think the order of destructors is defined. There would be no way to have semantic control because it wouldn't work when you link different files together. The only solution would be to have the compiler analyse the code and figure out what should be destructed first, which would be an impressive feat. The solution to your problem is not to close the file object in the destructor and to let the OS clean it up when your program terminates.
Re: Custom attributes (again)
On Sun, 08 Apr 2012 12:44:17 +0200, Jacob Carlborg d...@me.com wrote: On 2012-04-08 09:27, Marco Leise wrote: I don't want this thread to disappear. The ideas presented here have common basic features among the nice-to-haves. For these to work it would require: - user annotations to functions/methods/structs/classes - only CTFE support (as annotations don't change at runtime) I don't see why the attributes should be accessible at runtime. Even if they're read-only it's still good to be able to read the attributes at runtime. Yeah, it was supposed to mean that it requires CTFE support; runtime support is possible :) -- Marco
Re: A modest proposal: eliminate template code bloat
On Sun, 08 Apr 2012 15:01:56 +0400, Dmitry Olshansky dmitry.o...@gmail.com wrote: I think it's been ages since I meant to ask why nobody (as in compiler vendors) does what I think is a rather simple optimization. In the short term the plan is to introduce a link-time flavored optimization at code generation or (better) at the link step. For simplicity let's assume the compiler does all of the following during generation of an object file. 1. Every time a function is generated (or pretty much any symbol), not only is a size calculated but also a checksum* of its data. (If we go for link-time optimization we should find a place to stick it in the object file.) 2. The compiler maintains a hash table mapping symbol_size to a list (~ array) of pairs (reference to data, checksum) of all symbols with the given size. Call it a duplicate table. Every function generated, and more generally global immutable data, should end up there. 3. After any function is generated, the compiler checks for an entry in the duplicate table that matches in size, then in checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.) *Yes, checksum. I think it should be a really simple and easy to parallelize hash function. The checksum on its own is not reliable, and some amount of balancing and profiling is obviously required when picking this function. Applicability: It's not only const/immutable bloat, which can be alleviated with inout. There are plenty of places where the exact same code is being generated: e.g. sealed containers of int vs uint, std.find on dchar[] vs uint[]/int[] and so on. In general, coarse-grained parametrization is the root of all evil and it is inevitable since we are just humans after all. Notes: 1. If we do the checksum calculation on the fly during codegen it comes at virtually no cost, as the data is in the CPU data cache. 
A preliminary version can avoid hacking this part of the backend though. 2. By _alias_ I mean the ability of the compiler to emit references to a given symbol as if it was some other symbol (should be really straightforward). 3. The linker has more data and is able to achieve colossal size savings, essentially running through the same algorithm before actually linking things. Again it's an easily separable step and can be opt-in. 4. Ironically the exact same thing works with any kind of immutable data structures. It looks like string pooling is superseded by this proposal. Thoughts? Thoughts? Nothing much. I thought of that a while ago, but as an external program that finds function calls by disassembling and removes dead/duplicate code. So I agree with you. A similar feature is a CTFE cache (or general code cache) that checksums a function's source code and gets the compiled version from a cache. Template bloat could be especially important to 'fix' on embedded systems. But I don't consider it important enough at the moment. :/ Let's wait till the bugs and important features are implemented or hack the compiler ourselves. -- Marco
Re: Issue with module destructor order
On 04/08/2012 02:14 PM, Kevin Cox wrote: On Mar 26, 2012 5:11 AM, Benjamin Thaut c...@benjamin-thaut.de wrote: Is this intended behaviour or is this a bug? I assume this happens because of the mixin template and the public import. I'm using dmd 2.058. -- Kind Regards Benjamin Thaut I don't think the order of destructors is defined. There would be no way to have semantic control because it wouldn't work when you link different files together. The only solution would be to have the compiler analyse the code and figure out what should be destructed first, which would be an impressive feat. Actually that is what is normally done. (It is quite conservative though: cyclical imports where multiple modules have static constructors/destructors are just disallowed and terminate the program on startup.) The solution to your problem is not to close the file object in the destructor and to let the OS clean it up when your program terminates. This would be a workaround.
Re: Small Buffer Optimization for string and friends
On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote: [...] 1. What happened to the new hash project? We need to take that to completion. [...] Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementation The major outstanding issues are: - Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... } The problem is that inout gets carried over into the AA template, which breaks because it instantiates into something that has: struct Slot { hash_t hash; inout(T) key; // -- this causes a compile error Value value; } Ideally, AA keys should all be stored as immutable inside the AA, and automatically converted to/from the qualified type the user specified. - Template bloat: the current code uses template member functions, and will instantiate a new function for every implicit conversion of input key types. This also depends on IFTI, which has some quirks (compiler bugs) that make the code ugly (e.g., strings and arrays not treated equally by the compiler, requiring hacks to make implicit conversion work). Timon has suggested an alternative way of handling implicit conversions, which I think is better, but I need to take some time to actually implement it. - Static initialization of AA's (AA literals that compile directly into object code). This should be possible in principle, but I've run into what may be a CTFE bug that prevents it from working. - A not-so-major issue is to finish the toHash() implementations for all native types (currently it works for some common key types, but coverage is still incomplete). Once this is done, we can finally get rid of getHash from TypeInfo; UFCS will let us simply write x.toHash() for pretty much any type x. 
Once these issues are resolved, there remains the major task of actually integrating this code with druntime/dmd. A lot of work is expected on the dmd end, because of the current amount of hacks in dmd to make AA's work. T -- Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
Re: Small Buffer Optimization for string and friends
On Sunday, 8 April 2012 at 13:53:07 UTC, H. S. Teoh wrote: On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote: [...] 1. What happened to the new hash project? We need to take that to completion. [...] Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementation The major outstanding issues are: - Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... } The problem is that inout gets carried over into the AA template, which breaks because it instantiates into something that has: struct Slot { hash_t hash; inout(T) key; // -- this causes a compile error Value value; } Ideally, AA keys should all be stored as immutable inside the AA, and automatically converted to/from the qualified type the user specified. - Template bloat: the current code uses template member functions, and will instantiate a new function for every implicit conversion of input key types. This also depends on IFTI, which has some quirks (compiler bugs) that make the code ugly (e.g., strings and arrays not treated equally by the compiler, requiring hacks to make implicit conversion work). Timon has suggested an alternative way of handling implicit conversions, which I think is better, but I need to take some time to actually implement it. - Static initialization of AA's (AA literals that compile directly into object code). This should be possible in principle, but I've run into what may be a CTFE bug that prevents it from working. - A not-so-major issue is to finish the toHash() implementations for all native types (currently it works for some common key types, but coverage is still incomplete). 
Once this is done, we can finally get rid of getHash from TypeInfo; UFCS will let us simply write x.toHash() for pretty much any type x. Once these issues are resolved, there remains the major task of actually integrating this code with druntime/dmd. A lot of work is expected on the dmd end, because of the current amount of hacks in dmd to make AA's work. T doesn't this work? immutable std.traits.Unqual!(inout(T)) key;
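The suggestion at the end of this message can be tried in isolation with `std.traits.Unqual`, which strips the top-level qualifier from a type. In the D sketch below, `Slot` is a hypothetical stand-in for the slot struct quoted above; the checks use const keys only, so whether this also resolves the inout case described in the quoted text is not established here.

```d
import std.traits : Unqual;

// Hypothetical slot layout: whatever qualifier the user-facing key type K
// carries, the key is stored as immutable.
struct Slot(K, V)
{
    size_t hash;
    immutable(Unqual!K) key;
    V value;
}

// Unqual removes the outer qualifier...
static assert(is(Unqual!(const(int)) == int));
static assert(is(Unqual!(shared(int)) == int));

// ...so the stored key type no longer depends on how K was qualified.
static assert(is(typeof(Slot!(const(int), string).key) == immutable(int)));
static assert(is(typeof(Slot!(int, string).key) == immutable(int)));
```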
Re: A modest proposal: eliminate template code bloat
On Sun, Apr 08, 2012 at 03:01:56PM +0400, Dmitry Olshansky wrote: I think it's been ages since I meant to ask why nobody (as in compiler vendors) does what I think is rather simple optimization. In the short term the plan is to introduce a link-time flavored optimization at code generation or (better) link step. This would be incompatible with how current (non-dmd) linkers work. But I do like the idea. Perhaps if it works well, other linkers will adopt it? (Just like how the gcc linker adopted duplicate template code elimination due to C++ templates.) For simplicity let's assume compiler does all of the following during generation of an object file. 1. Every time a function is generated (or pretty much any symbol) not only a size calculated but also a checksum* of it's data. (If we go for link-time optimization we should find a place to stick it to in the object file) We'd have to make sure the checksum doesn't end up in the final executable though, otherwise the bloat may negate any gains we've made. 2. Compiler maintains a hash-table of symbol_size --- list( ~ array) of pairs (references to data, checksum) of all symbols with given size. Call it a duplicate table. Every function generated and more generally global immutable data should end up there. 3. After any function was generated compiler checks an entry in the duplicate table that matches size, followed by matching checksum and only then (if required) doing a straight memcmp. If it happens that there is a match compiler just throws generated code away and _aliases_ it's symbol to that of a matched entry. (so there has to be an alias table if there isn't one already) I think you don't even need an alias table; IIRC the OS dynamic linker can easily handle symbols that have the same value (i.e. that point to the same place). All you have to do is to change the value of the duplicated symbols so that they all point to the same address. [...] 
Applicability: It's not only const-immutable bloat, it can be alleviated with inout. Yet there are plenty of places the exact same code is being generated: e.g. sealed containers of int vs uint, std.find on dchar[] vs uint[]/int[] and so on. In general, the coarse grained parametrization is the root of all evil and it is inevitable since we are just humans after all. I'm not sure I understand the last sentence there, but duplicate code elimination is definitely a big plus. It will also mean that we can use templates more freely without having to worry about template bloat. Notes: 1. If we do checksum calculation on the fly during codegen it comes at virtually no cost as the data is in CPU data cache. Preliminary version can avoid hacking this part of backend though. 2. By _alias_ I mean the ability of compiler to emit references to a given symbol as if it was some other symbol (should be really straightforward). Like I said, I think this isn't even necessary, if the compiler can just generate the same value for the duplicated symbols. 3. The linker has more data and is able to achieve colossal size savings, essentially running through the same algorithm before actually linking things. Again it's an easily separable step and can be an opt-in. This assumes the (maybe external) linker knows how to take advantage of the info. But IMO, linkers *should* be a lot smarter than they currently are, so I don't see a problem with this as long as a dumb linker will still produce a working executable (just without the space savings). Alternatively we can have an external pre-link tool that scans a given set of object files and eliminates duplicated code (by turning duplicated symbols into external references in all but one of the instances), before we hand the files off to the OS's native linker. Come to think of it, this might be a good way to experiment with this idea before we commit lots of effort into integrating it with a real linker. 4. 
Ironically the same exact thing works with any kind of immutable data structures. It looks like string pooling is superseded by this proposal. [...] Not really... string pooling can take advantage of overlapping (sub)strings, but I don't think you can do that with code. But I think your idea has a lot of merit. I'm for making linkers smarter than they currently are. T -- It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
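The three numbered steps quoted above (bucket by size, compare checksums, then compare bytes) can be sketched in D roughly as follows. This is only an illustration under stated assumptions, not code from dmd or any linker: `Symbol`, `DuplicateTable`, `add` and `aliases` are hypothetical names, CRC32 merely stands in for whatever checksum a real implementation would pick, and array equality plays the role of the "straight memcmp".

```d
import std.digest.crc : crc32Of;

/// A generated symbol: its name and the bytes emitted for it (hypothetical type).
struct Symbol
{
    string name;
    immutable(ubyte)[] code;
}

/// Hypothetical duplicate table: symbol size -> candidates of that size.
struct DuplicateTable
{
    private static struct Entry
    {
        ubyte[4] checksum;
        Symbol symbol;
    }

    private Entry[][size_t] bySize;
    string[string] aliases; // duplicate name -> name of the kept symbol

    /// Registers `sym` and returns the name of the symbol whose code is kept.
    string add(Symbol sym)
    {
        const ubyte[4] checksum = crc32Of(sym.code);

        // Step 1: only symbols of the same size can be duplicates.
        if (auto bucket = sym.code.length in bySize)
        {
            foreach (ref entry; *bucket)
            {
                // Step 2: cheap checksum test; step 3: full byte comparison.
                if (entry.checksum == checksum && entry.symbol.code == sym.code)
                {
                    aliases[sym.name] = entry.symbol.name;
                    return entry.symbol.name;
                }
            }
        }

        // No match: keep this symbol as a new candidate.
        bySize[sym.code.length] ~= Entry(checksum, sym);
        return sym.name;
    }
}

unittest
{
    DuplicateTable table;
    immutable(ubyte)[] code = [0x89, 0xF8, 0xC3];
    assert(table.add(Symbol("f!int", code)) == "f!int");
    assert(table.add(Symbol("f!uint", code.idup)) == "f!int"); // folded
}
```

Whether folded symbols are recorded in an alias table, as here, or simply given the same address, as the reply suggests, is a separate design choice; the lookup itself is the same either way.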
Re: A modest proposal: eliminate template code bloat
On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function was generated compiler checks an entry in the duplicate table that matches size, followed by matching checksum and only then (if required) doing a straight memcmp. If it happens that there is a match compiler just throws generated code away and _aliases_ its symbol to that of a matched entry. (so there has to be an alias table if there isn't one already) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)f!int!=cast(void*)f!uint); artur
Re: Small Buffer Optimization for string and friends
On Sun, Apr 08, 2012 at 04:11:10PM +0200, Tove wrote: On Sunday, 8 April 2012 at 13:53:07 UTC, H. S. Teoh wrote: On Sun, Apr 08, 2012 at 12:56:38AM -0500, Andrei Alexandrescu wrote: [...] 1. What happened to the new hash project? We need to take that to completion. [...] [...] The major outstanding issues are: - Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... } The problem is that inout gets carried over into the AA template, which breaks because it instantiates into something that has: struct Slot { hash_t hash; inout(T) key; // <-- this causes a compile error Value value; } [...] doesn't this work? immutable std.traits.Unqual!(inout(T)) key; I suppose so, but the problem is more with the return type of various AA functions. If the user writes int[inout(T)] aa, then he would expect that aa.keys should return inout(T)[]. But the problem here is that this is impossible to express in the current system, because inout has a different meaning when you write it in the AA method, than its meaning in the context of the calling function. For example, this doesn't quite work: struct AA { ... inout(Key) keys() { ... // Problem: inout here doesn't mean what it // means in the context of the caller } } T -- Tech-savvy: euphemism for nerdy.
Re: Small Buffer Optimization for string and friends
On 8 April 2012 12:54, Manu turkey...@gmail.com wrote: On 8 April 2012 12:46, Vladimir Panteleev vladi...@thecybershadow.net wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32-bit? The only way I can see this working is if the marker happens to be the top bit of the size (limiting arrays to 2 GB on 32-bit systems, which is probably fine), and if set, the next 7 bits are the size, leaving 7 bytes for a string... 7 bytes is pushing it, 15 bytes is very useful, 7 is borderline... That all said, I'll ultimately end up with my own string type anyway, whose size is a multiple of 16 bytes and which is aligned to 16 bytes, so that it supports SSE string opcodes. I wonder if these considerations can be factored into the built-in string? Is it realistic that anyone can actually use raw D strings in an app that performs a lot of string manipulation? I bet most people end up with a custom string class anyway... Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew).
Re: Small Buffer Optimization for string and friends
On 4/8/12 1:33 AM, Daniel Murphy wrote: - This has been a disaster for AAs Making them magic has been the problem. We must revert that. - Is it worth doing for 32 bit? Probably not. - Would generate false pointers Fair point but we're also moving to precise collection :o). - Run-time check on every array access? Cost is negligible in most cases according to the extensive measurements we've done. - Why should this be in the language/compiler instead of phobos? People use built-in immutable(char)[] as strings. We want to benefit existing uses. Andrei
Re: std.benchmark ready for review. Manager sought after
On 4/8/12 2:02 AM, Vladimir Panteleev wrote: The benchmark_relative_ prefix makes sense for function names (for scheduleForBenchmarking), but not so much for string literals for benchmark names. The string literal "benchmark_relative_file read" looks like the words "benchmark relative file" are grouped together, with "read" added on. So, my suggestion would be to wrap the benchmark_relative_ prefix - when used with benchmark name strings - into a semantic function / enum / etc. In my example above, relativeBenchmark would be: string relativeBenchmark(string s) { return "benchmark_relative_" ~ s; } I suppose it can be summed up as a tradeoff between complexity (you need to explain both the function name usage and the relativeBenchmark wrapper usage) vs. code prettiness. I understand, thanks. Andrei
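For illustration, the suggested wrapper might look like the sketch below. Only the `benchmark_relative_` prefix and the name `relativeBenchmark` come from the thread; the `relativePrefix` enum is a hypothetical addition, and nothing here is taken from the actual std.benchmark pull request.

```d
/// The prefix discussed in the thread, kept in one place (hypothetical enum).
enum relativePrefix = "benchmark_relative_";

/// Wraps a plain benchmark name so call sites never spell out the prefix.
string relativeBenchmark(string name) pure nothrow @safe
{
    return relativePrefix ~ name;
}

unittest
{
    // The readable name stays intact at the call site.
    assert(relativeBenchmark("file read") == "benchmark_relative_file read");

    // The function is simple enough to run at compile time as well.
    enum name = relativeBenchmark("append");
    static assert(name == "benchmark_relative_append");
}
```

The cost is the one named in the post: readers now have to learn both the naming convention and the wrapper.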
Re: std.benchmark ready for review. Manager sought after
On 4/8/12 3:16 AM, Somedude wrote: Like it. Would it be a good idea to add a column with an average memory used ? Interesting idea. I saw http://stackoverflow.com/questions/1674652/c-c-memory-usage-api-in-linux-windows and it looks like it's not an easy problem. Should we make this part of the initial release? Andrei
Re: Discussion on Go and D
On 4/8/12 4:20 AM, Andrej Mitrovic wrote: On 4/7/12, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: 3. The ability to dispose of memory will disappear along with the delete keyword. Pull this and hopefully that myth will come to an end: https://github.com/D-Programming-Language/d-programming-language.org/pull/112 Done. Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 4:46 AM, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. Hehe. Good idea. On big endian machines the last byte is the ticket! Andrei
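The two figures quoted above are consistent with reserving a single value of the highest-order length byte as the discriminator, although the thread does not spell out the calculation, so this reading is an assumption. A short D sketch of the arithmetic, with hypothetical names (`fullRange`, `reserved`, `limit`, `exabytes`):

```d
import std.math : fabs;

enum ulong fullRange = ulong.max;             // 2^64 - 1 addressable bytes
enum ulong reserved  = 1UL << 56;             // lengths sharing one top-byte value
enum ulong limit     = 0xFF00_0000_0000_0000; // first length with top byte 0xFF

// Reserving top byte 0xFF leaves lengths below 2^64 - 2^56.
static assert(limit == fullRange - reserved + 1);

/// Converts a byte count to decimal exabytes (10^18 bytes).
double exabytes(ulong bytes)
{
    return bytes / 1e18;
}

unittest
{
    assert(fabs(exabytes(fullRange) - 18.45) < 0.01);
    assert(fabs(exabytes(limit) - 18.37) < 0.01);
}
```

In other words, giving up one of the 256 possible top-byte values costs about 0.4% of the length range.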
Re: Small Buffer Optimization for string and friends
On 4/8/12 4:30 AM, Andrej Mitrovic wrote: On 4/8/12, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: essentially making them templates I just hope that doesn't cause: 1) Awful template errors 2) Slower build times 3) More ICEs Walter and I agree that relying on sheer D instead of compiler-generated magic (in this case, bitmaps etc) is the better solution. Implementing a generic marker as a template using introspection is very simple - I predict a few dozen lines. Once that is finished there will obviously be no template errors. I don't know whether build times would be impacted. There will be fewer ICEs because there will be less reliance on the compiler. Andrei
Re: Shared library in D on Linux
On 04/08/2012 03:45 AM, Timo Westkämper timo.westkam...@gmail.com wrote: extern(C) { void gc_init(); void gc_term(); void _init() { gc_init(); } void _fini() { gc_term(); } } I think you want rt_init and rt_term here.
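A minimal sketch of that suggestion, assuming a current druntime where `core.runtime.Runtime.initialize` and `Runtime.terminate` wrap `rt_init` and `rt_term`. The exported names `mylib_init` and `mylib_term` are hypothetical; a C host would call them explicitly after loading and before unloading the library, rather than relying on `_init` and `_fini`.

```d
import core.runtime : Runtime;

/// Hypothetical entry point: brings up the whole D runtime
/// (GC, module constructors, TLS), not just the garbage collector.
extern (C) int mylib_init()
{
    return Runtime.initialize() ? 1 : 0;
}

/// Hypothetical exit point: shuts the runtime down again.
extern (C) int mylib_term()
{
    return Runtime.terminate() ? 1 : 0;
}

unittest
{
    // Inside a D program the runtime is already up, so this pair
    // only adjusts druntime's initialization count.
    assert(mylib_init() == 1);
    assert(mylib_term() == 1);
}
```

Calling only `gc_init` would leave module constructors and other runtime state untouched, which is presumably why the fuller pair is recommended.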
Re: Small Buffer Optimization for string and friends
On 4/8/12 4:55 AM, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 09:46:28 UTC, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. Erm... never mind, I thought the pointer was the first field. Me too, actually. Even so, how would you use the lowest-order byte of the length as a discriminator? Unless you allocated bytes 2-8 for the length (which would be an unaligned read, or a shift every time...) It's amazing how fast shift works :o). "Making assumptions about the memory layout" now seems like the only solution. Also, what will be .ptr for such arrays? It can't point inside the string, because the type is immutable and the array is on the stack. Duplication can be expensive as well. Once anyone asks for .ptr a conservative copy will be made. Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 9:49 AM, Andrei Alexandrescu wrote: On 4/8/12 4:46 AM, Vladimir Panteleev wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. Hehe. Good idea. On big endian machines the last byte is the ticket! s/big/little/
Re: Small Buffer Optimization for string and friends
On 4/8/12 5:45 AM, Jacob Carlborg wrote: On 2012-04-08 07:56, Andrei Alexandrescu wrote: For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions: Just don't make the same mistake as with AA. The mistake with AAs was done long ago, but it was forced as AAs predated templates. Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 4:54 AM, Manu wrote: On 8 April 2012 12:46, Vladimir Panteleev vladi...@thecybershadow.net wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32bit? We can experiment with making strings shorter than 8 chars in-situ. The drawback will be that length will be limited to 29 bits, i.e. 512MB. Andrei
Re: std.benchmark ready for review. Manager sought after
On 4/8/12 5:46 AM, Piotr Szturmaj wrote: Andrei Alexandrescu wrote: I finally found the time to complete std.benchmark. I got to a very simple API design, starting where I like it: one line of code. Code is in the form of a pull request at https://github.com/D-Programming-Language/phobos/pull/529. (There's some noise in there caused by my git n00biness). Documentation is at http://erdani.com/d/web/phobos-prerelease/std_benchmark.html. For algorithms that process sequences, it would be nice to have results represented in cycles per item. This should give more consistent results across different CPU families and different clock speeds. Specifically, I think about cycles per byte, see http://en.wikipedia.org/wiki/Cycles_per_byte Example: http://www.cryptopp.com/benchmarks.html The framework aims at generality and simplicity. I'd rather have it simple and universal rather than subtle. Andrei
Re: Small Buffer Optimization for string and friends
On 2012-04-08 05:56:38 +, Andrei Alexandrescu seewebsiteforem...@erdani.org said: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. It turns out statistically a lot of strings are small. According to a variety of systems we use at Facebook, the small buffer optimization is king - it just works great in all cases. In D that means better speed, better locality, and less garbage. Small buffer optimization is a very good thing to have indeed. But… how can you preserve existing semantics? For instance, let's say you have this: string s = "abcd"; which is easily representable as a small string. Do you use the small buffer optimization in the assignment? That seems like a definitive yes. But as soon as you take a pointer to that string, you break the immutability guarantee: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where? There's also the issue of this code being legal currently: immutable(char)* getPtr(string s) { return s.ptr; } If you pass a small string to getPtr, it'll be copied to the local stack frame and you'll be returning a pointer to that local copy. You could mitigate this by throwing an error when trying to get the pointer to a small string, but then you have to disallow taking the pointer of a const(char)[] pointing to it: const(char)* getPtr2(const(char)[] s) { return s.ptr; } const(char)* getAbcdPtr() { string s = "abcd"; // s implicitly converted to regular const(char)[] pointing to local stack frame const(char)* c = getPtr2(s); // c points to the storage of s, which is the local stack frame return c; } So it's sad, but I am of the opinion that the only way to implement small buffer optimization is to have a higher-level abstraction, a distinct type for such small strings. 
-- Michel Fortin michel.for...@michelf.com http://michelf.com/
Re: Small Buffer Optimization for string and friends
On 8 April 2012 17:52, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 4/8/12 4:54 AM, Manu wrote: On 8 April 2012 12:46, Vladimir Panteleev vladi...@thecybershadow.net wrote: On Sunday, 8 April 2012 at 05:56:36 UTC, Andrei Alexandrescu wrote: Walter and I discussed today about using the small string optimization in string and other arrays of immutable small objects. On 64 bit machines, string occupies 16 bytes. We could use the first byte as discriminator, which means that all strings under 16 chars need no memory allocation at all. Don't use the first byte. Use the last byte. The last byte is the highest-order byte of the length. Limiting arrays to 18.37 exabytes, as opposed to 18.45 exabytes, is a much nicer limitation than making assumptions about the memory layout. What is the plan for 32bit? We can experiment with making strings shorter than 8 chars in-situ. The drawback will be that length will be limited to 29 bits, i.e. 512MB. 29 bits? ...not 31? How does this implementation actually work? On 32/64 bits, and little/big endian? I can only imagine it working with a carefully placed 1 bit. bit-0 of the size on little endian, and bit-31 of the size on big endian. That should only halve the address range (leaving 31 bits)... where did the other 2 bits go? I also hope this only affects slices of chars? It will ignore this behaviour for anything other than char arrays right?
Re: Small Buffer Optimization for string and friends
On 4/8/12 8:54 AM, H. S. Teoh wrote: - Qualified keys not fully working: the current code has a few corner cases that don't work with shared/immutable/inout keys. One major roadblock is how to implement this: alias someType T; inout(T) myFunc(inout(T) arg, ...) { int[inout(T)] aa; ... } I wonder how frequently such code is used. Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 8:54 AM, H. S. Teoh wrote: Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementation Thanks for the update! Let me reiterate this is important work for many reasons, so it would be great if you got around to pushing it through. Do you think someone else in the community could help you? Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 9:59 AM, Michel Fortin wrote: But as soon as you take a pointer to that string, you break the immutability guarantee: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where? Taking .ptr will engender a copy. A small regression will be that address of individual chars cannot be taken. Andrei
Re: Small Buffer Optimization for string and friends
On 4/8/12 10:03 AM, Manu wrote: 29 bits? ...not 31? Sorry, 31 indeed. How does this implementation actually work? On 32/64 bits, and little/big endian? I can only imagine it working with a carefully placed 1 bit. bit-0 of the size on little endian, and bit-31 of the size on big endian. That should only halve the address range (leaving 31 bits)... where did the other 2 bits go? Essentially it will use either the first or the last bit of the representation as discriminator. That bit is most likely taken from the length representation. Shifting and masking can easily account for it when computing length of large strings. I also hope this only affects slices of chars? It will ignore this behaviour for anything other than char arrays right? It works for any arrays of sufficiently small immutable data type (e.g. immutable(byte)[]), but the most advantage is reaped for string. Andrei
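To make the discriminator idea concrete, here is a minimal sketch of one possible 16-byte layout. It is not the design Walter and Andrei had in mind, which the thread leaves unspecified: `SmallString`, its field order (pointer first, length second) and the choice of the top bit of the last byte as the tag are all assumptions, and the sketch only targets 64-bit little-endian machines.

```d
static assert(size_t.sizeof == 8, "sketch assumes a 64-bit target");
version (LittleEndian) {} else static assert(0, "sketch assumes little-endian");

/// Hypothetical small-string type; not the built-in string.
struct SmallString
{
    enum maxSmall = 15; // 16 bytes of storage minus one tag byte

    private union
    {
        ubyte[16] raw;  // inline storage; raw[15] is the tag byte
        struct
        {
            const(char)* ptr;
            size_t length; // its highest-order byte overlaps raw[15]
        }
    }

    this(string s)
    {
        if (s.length <= maxSmall)
        {
            // Small: copy the characters inline, store tag bit plus length.
            raw[0 .. s.length] = cast(const(ubyte)[]) s;
            raw[15] = cast(ubyte)(0x80 | s.length);
        }
        else
        {
            // Large: ordinary pointer and length; top bit of length stays 0.
            ptr = s.ptr;
            length = s.length;
        }
    }

    bool isSmall() const { return (raw[15] & 0x80) != 0; }

    size_t size() const { return isSmall ? raw[15] & 0x7F : length; }

    /// For small strings this slice points into the struct itself.
    const(char)[] data() const
    {
        return isSmall ? cast(const(char)[]) raw[0 .. raw[15] & 0x7F]
                       : ptr[0 .. length];
    }
}

static assert(SmallString.sizeof == 16);

unittest
{
    auto small = SmallString("abcd");
    assert(small.isSmall && small.size == 4 && small.data == "abcd");
    auto large = SmallString("sixteen chars!!!");
    assert(!large.isSmall && large.data == "sixteen chars!!!");
}
```

Because `data` returns a slice into the struct for small strings, the pointer-escape concerns raised elsewhere in the thread apply directly to a layout like this one.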
Re: Small Buffer Optimization for string and friends
On 4/8/12 9:26 AM, Manu wrote: Is it realistic that anyone can actually use raw D strings in an app that performs a lot of string manipulation? Yes. I bet most people end up with a custom string class anyway... That does happen, but much more rarely than one might think. Who's written a string-heavy app without their own string helper class? I ended up with a string class within about half an hour of trying to work with D strings (initially just to support stack strings, then it grew). A lot of people write string-heavy apps with the built-in strings. "Heavy" actually describes a continuum. No matter how you put it, improving the performance of built-in strings is beneficial. Andrei
Re: A modest proposal: eliminate template code bloat
On 08.04.2012 18:21, Artur Skawina wrote: On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function was generated compiler checks an entry in the duplicate table that matches size, followed by matching checksum and only then (if required) doing a straight memcmp. If it happens that there is a match compiler just throws generated code away and _aliases_ its symbol to that of a matched entry. (so there has to be an alias table if there isn't one already) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)f!int!=cast(void*)f!uint); artur A reference to spec page plz. -- Dmitry Olshansky
Re: Small Buffer Optimization for string and friends
On 2012-04-08 16:54, Andrei Alexandrescu wrote: On 4/8/12 5:45 AM, Jacob Carlborg wrote: On 2012-04-08 07:56, Andrei Alexandrescu wrote: For this to happen, we need to start an effort of migrating built-in arrays into runtime, essentially making them templates that the compiler lowers to. So I have two questions: Just don't make the same mistake as with AA. The mistake with AAs was done long ago, but it was forced as AAs predated templates. Andrei I'm referring to the new template implementation of AAs that got reverted due to everything breaking, if I recall correctly. -- /Jacob Carlborg
Re: std.benchmark ready for review. Manager sought after
On 08.04.2012 12:16, Somedude wrote: [snip] Like it. Would it be a good idea to add a column with an average memory used ? In general it's next to impossible and/or entirely OS-specific. What can be done, I think, is adding a query function to the GC interface that returns the amount of RAM currently allocated on its heap. So adding GC heap usage can be done, although it's not really a stable metric. This way one gets a nice hint on why something is slow ;) -- Dmitry Olshansky
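A query of this kind can be sketched with `core.memory.GC.stats`, which exists in current druntime but postdates this thread, so treat it as an assumption about the reader's toolchain rather than something available to the posters. `gcUsedBytes` is a hypothetical helper name.

```d
import core.memory : GC;

/// Bytes currently in use on the GC heap, as reported by druntime.
size_t gcUsedBytes()
{
    return GC.stats().usedSize;
}

unittest
{
    auto buffer = new ubyte[](1 << 20); // 1 MiB on the GC heap
    buffer[0] = 1;                      // keep the buffer live

    // The live buffer has to be accounted for in the reported usage.
    assert(gcUsedBytes() >= buffer.length);
}
```

As the post warns, the figure depends on when collections happen to run, so it is better read as a hint than as a stable benchmark column.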
Re: Small Buffer Optimization for string and friends
On 2012-04-08 15:06:13 +, Andrei Alexandrescu seewebsiteforem...@erdani.org said: On 4/8/12 9:59 AM, Michel Fortin wrote: But as soon as you take a pointer to that string, you break the immutability guarantee: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where? Taking .ptr will engender a copy. A small regression will be that address of individual chars cannot be taken. You know, many people have been wary of hidden memory allocations in the past. That's not going to make them happy. I'm not complaining, but I think .ptr should return null in those cases. Let people use toStringz when they need a C string, and let people deal with the ugly details themselves if they're using .ptr to bypass array bound checking. Because if someone used .ptr somewhere to bypass bound checking and instead he gets a memory allocation at each loop iteration… it won't be pretty. And what about implicit conversions to const(char)[]? That too will require a copy, because otherwise it could point to the local stack frame where your immutable(char)[] resides. That said, maybe copies of small-string optimized immutable(char)[] could be small-string optimized const(char)[]. That'd not have any side effect since no one can have a mutable pointer/slice to the const copy anyway. -- Michel Fortin michel.for...@michelf.com http://michelf.com/
Re: Shared library in D on Linux
I'm interested in the same thing. I have a question about _tlsend and _tlsstart: what are they used for? My tentative assumption was that they point to the beginning of the TLS segment. What are _deh_begin and _deh_end used for?
Re: Small Buffer Optimization for string and friends
On 4/8/12 10:48 AM, Michel Fortin wrote: On 2012-04-08 15:06:13 +, Andrei Alexandrescu seewebsiteforem...@erdani.org said: On 4/8/12 9:59 AM, Michel Fortin wrote: But as soon as you take a pointer to that string, you break the immutability guarantee: immutable(char)[] s = "abcd"; immutable(char)* p = s.ptr; s = "defg"; // assigns to where? Taking .ptr will engender a copy. A small regression will be that address of individual chars cannot be taken. You know, many people have been wary of hidden memory allocations in the past. Well, the optimization makes for fewer allocations total. In fact .ptr does the allocation that was formerly mandatory. That's not going to make them happy. I'm not complaining, but I think .ptr should return null in those cases. That would be too large a regression I think. And it's not detectable during compilation. Let people use toStringz when they need a C string, and let people deal with the ugly details themselves if they're using .ptr to bypass array bound checking. Because if someone used .ptr somewhere to bypass bound checking and instead he gets a memory allocation at each loop iteration… it won't be pretty. Only one allocation. First invocation of .ptr effectively changes representation. And what about implicit conversions to const(char)[]? That too will require a copy, because otherwise it could point to the local stack frame where your immutable(char)[] resides. That said, maybe copies of small-string optimized immutable(char)[] could be small-string optimized const(char)[]. That'd not have any side effect since no one can have a mutable pointer/slice to the const copy anyway. I think casting to const(char)[] should work without allocation. Andrei
Re: A modest proposal: eliminate template code bloat
On 04/08/12 17:20, Dmitry Olshansky wrote: On 08.04.2012 18:21, Artur Skawina wrote: On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function was generated compiler checks an entry in the duplicate table that matches size, followed by matching checksum and only then (if required) doing a straight memcmp. If it happens that there is a match compiler just throws generated code away and _aliases_ its symbol to that of a matched entry. (so there has to be an alias table if there isn't one already) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)f!int!=cast(void*)f!uint); A reference to spec page plz. A reference to *a* D spec, please... There isn't one, but that does not mean that common sense does not need to apply. Do you really suggest making different template instantiations identical, just because the compiler happened to emit the same code? The situation is not very different from const int a = 1; const uint b = 1; assert(a!=b); The user can take the addresses, compare them, use them as AA keys, set breakpoints etc. Note that my point is just that the compiler needs to emit a dummy so that the addresses remain unique, e.g. module.f!uint: jmp module.f!int that is only used when taking the address. Even calling f!int() instead of the uint version could be acceptable, as long as there is a compiler option to turn this optimization off (think breakpoints). artur
Re: A modest proposal: eliminate template code bloat
On Sun, 8 Apr 2012 07:18:26 -0700, H. S. Teoh hst...@quickfur.ath.cx wrote: We'd have to make sure the checksum doesn't end up in the final executable though, otherwise the bloat may negate any gains we've made. Executables (and object files) are made up mostly of sections, some of which are 'special cased' to contain the code, zero initialized data, thread local storage etc. and some user defined. The checksums would most probably end up in their own section, as already happens for debug info or comments. Using a tool like strip you can remove any section by its name. Once a linker knows how to use the checksums, it would strip them by default. -- Marco
Re: readonly storage class
On 08.04.2012 11:49, Simen Kjaeraas wrote: On Sun, 08 Apr 2012 11:39:03 +0200, Benjamin Thaut c...@benjamin-thaut.de wrote: On 08.04.2012 11:28, Jonathan M Davis wrote: On Sunday, April 08, 2012 11:16:40 Benjamin Thaut wrote: While typing D code I usually come across the problem that neither const nor immutable describe the usage pattern of the memory I'm currently working on 100%. Sometimes I have immutable data that has been shared among threads that I want to pass to a function. Then I have some const data that I want to pass to the same function. Currently you don't have any other choice but to write that function two times. But the function itself does not need the extended properties of const or immutable: const: can be cast back to mutable immutable: can be implicitly shared among threads The only thing the function cares about is that it will not change the data passed to it. It would be kind of nice to have a third storage class readonly. It cannot be cast back to mutable and it cannot be implicitly shared among threads, but both const and immutable implicitly convert to readonly, because both of these storage classes lose one of their properties during conversion. That way you only have to write the function once and can pass both const and immutable data to it. Just an idea, comments and criticism welcome. I would point out that casting const to mutable and then altering the variable is subverting the type system. The compiler does not support casting away either const or immutable to alter _anything_. So, as far as the type system is concerned, if you want a function that takes both const and immutable, it should take const. Now, you _can_ cast away const and alter a variable if you're careful, but you're subverting the type system when you do so and throwing away any guarantees that the compiler gives you. It's far from safe. 
Given that casting away const on a variable and then mutating is subverting the type system and that therefore the compiler is free to assume that you will never do it, I don't see what your idea of readonly would buy us. It's the same as const. - Jonathan M Davis I'll come up with an example. But if what you say is true, why is immutable not implicitly convertible to const? It is. Thanks, then this is a misunderstanding on my side, and this topic is irrelevant. But what about calling const methods on immutable objects? -- Kind Regards Benjamin Thaut
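The point settled in this exchange, that a single const parameter already accepts both mutable and immutable data, and the closing question about const methods can both be checked with a short D sketch. `sum` and `Counter` are hypothetical names chosen only for illustration.

```d
/// Promises not to modify the data, so one function serves every caller.
int sum(const(int)[] values) pure nothrow @safe
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

struct Counter
{
    int n;

    /// A const method is callable on mutable, const and immutable objects.
    int get() const { return n; }
}

unittest
{
    int[] mutableData = [1, 2, 3];
    immutable(int)[] sharedData = [4, 5, 6];

    // Both convert implicitly to const(int)[]; no second overload is needed.
    assert(sum(mutableData) == 6);
    assert(sum(sharedData) == 15);

    immutable Counter fixed = Counter(7);
    Counter changing = Counter(8);
    assert(fixed.get() == 7);
    assert(changing.get() == 8);
}
```

So the proposed readonly qualifier would, for this use case, behave like the existing const.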
Re: a pretty exciting result for parallel D lang rmd following defrag by name
On Sunday, 8 April 2012 at 09:21:43 UTC, Somedude wrote: Hi, You seem to have done a pretty good job with your parallel unzip. Have you tried a parallel zip as well ? Do you think you could include this in std.zip when you're done ? I'm going to do a parallel zip as well. There is already a parallel zip utility available with 7zip, so I haven't looked closely at D's std.zip. These parallel implementations all bring in std.parallelism as a dependency, and I don't know if that is acceptable. I'm just putting them in my github for now, along with examples.
Re: A modest proposal: eliminate template code bloat
On Sun, 08 Apr 2012 16:21:14 +0200, Artur Skawina art.08...@gmail.com wrote: On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function is generated, the compiler checks for an entry in the duplicate table that matches its size, followed by a matching checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)&f!int != cast(void*)&f!uint); artur Do you actually rely on that behavior? It is the same as asking this to work, I think: string a = "abc"; string b = "abcdef"; assert(a.ptr !is b.ptr); There should be ways to logically check the inequality of your two functions other than by comparing their memory addresses. -- Marco
Re: A modest proposal: eliminate template code bloat
On 4/8/12 10:59 AM, Artur Skawina wrote: On 04/08/12 17:20, Dmitry Olshansky wrote: On 08.04.2012 18:21, Artur Skawina wrote: On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function is generated, the compiler checks for an entry in the duplicate table that matches its size, followed by a matching checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)&f!int != cast(void*)&f!uint); A reference to spec page plz. A reference to *a* D spec, please... There isn't one, but that does not mean that common sense does not need to apply. Doesn't apply to C++. Andrei
Re: std.benchmark ready for review. Manager sought after
On Sun, 08 Apr 2012 09:35:14 -0500, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: On 4/8/12 3:16 AM, Somedude wrote: Like it. Would it be a good idea to add a column with the average memory used? Interesting idea. I saw http://stackoverflow.com/questions/1674652/c-c-memory-usage-api-in-linux-windows and it looks like it's not an easy problem. Should we make this part of the initial release? Andrei Oh please, I use these functions already, and with the similarities amongst operating systems they should be somewhere in Phobos! My thought was more along the lines of std.procmgmt/std.process though, since they are process management functions useful for server monitoring as well, not just benchmarking. -- Marco
Re: A modest proposal: eliminate template code bloat
On 08.04.2012 19:59, Artur Skawina wrote: On 04/08/12 17:20, Dmitry Olshansky wrote: On 08.04.2012 18:21, Artur Skawina wrote: On 04/08/12 13:01, Dmitry Olshansky wrote: 3. After any function is generated, the compiler checks for an entry in the duplicate table that matches its size, followed by a matching checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.) [...] Thoughts? Don't forget that this needs to work: static auto f(T)(T a) { return a; } assert(cast(void*)&f!int != cast(void*)&f!uint); A reference to spec page plz. A reference to *a* D spec, please... Yeah, sorry. There isn't one, but that does not mean that common sense does not need to apply. There is common sense and that is (in my book): don't tie the compiler's hands for no benefit. Do you really suggest making different template instantiations identical, just because the compiler happened to emit the same code? The situation is not very different from const int a = 1; const uint b = 1; assert(&a != &b); I wouldn't expect this assert to always hold. Moreover (all things being equal) I would consider taking the address of constant integers a poor practice. The user can take the addresses, compare them, use as AA keys, set breakpoints etc. Yes, that's what I call poor practice (I mean a function pointer as an AA _key_, seriously?). As for breakpoints, obviously one debugs a non-optimized program (at least most of the time). Note that my point is just that the compiler needs to emit a dummy so that the addresses remain unique, e.g. module.f!uint: jmp module.f!int Could work as a conservative option. But I don't think it's justified. that only is used when taking the address. Even calling f!int() instead of the uint version could be acceptable, as long as there is a compiler option to turn this optimization off (think breakpoints). 
Yup, that's what optimizations are. -- Dmitry Olshansky
Re: A modest proposal: eliminate template code bloat
On 08.04.2012 18:18, H. S. Teoh wrote: [snip] 1. Every time a function is generated (or pretty much any symbol), not only is its size calculated but also a checksum* of its data. (If we go for link-time optimization we should find a place to stick it in the object file) We'd have to make sure the checksum doesn't end up in the final executable though, otherwise the bloat may negate any gains we've made. Easy: the symbol size is in the object file (obviously), but it is surely not present in the executable. If I'm worth anything, legacy formats have plenty of slack space reserved for the future. Now is the future :) [snip] 3. After any function is generated, the compiler checks for an entry in the duplicate table that matches its size, followed by a matching checksum, and only then (if required) does a straight memcmp. If there is a match, the compiler just throws the generated code away and _aliases_ its symbol to that of the matched entry. (So there has to be an alias table if there isn't one already.) I think you don't even need an alias table; IIRC the OS dynamic linker can easily handle symbols that have the same value (i.e. that point to the same place). All you have to do is to change the value of the duplicated symbols so that they all point to the same address. Nice to know. [...] Applicability: It's not only const-immutable bloat, which can be alleviated with inout. There are plenty of places where the exact same code is being generated: e.g. sealed containers of int vs uint, std.find on dchar[] vs uint[]/int[] and so on. In general, coarse-grained parametrization is the root of all evil and it is inevitable since we are just humans after all. I'm not sure I understand the last sentence there, but duplicate code elimination is definitely a big plus. It will also mean that we can use templates more freely without having to worry about template bloat. It's easy - define a template on type T. Code it up. Now how many times did you consider that e.g. you can parametrize on the size of the type instead of the type itself? I'm making a point that it's almost always the case with sealed containers of PODs, for instance. (Now multiply the argument by the total number of parameters.) Notes: 1. If we do the checksum calculation on the fly during codegen, it comes at virtually no cost as the data is in the CPU data cache. A preliminary version can avoid hacking this part of the backend though. 2. By _alias_ I mean the ability of the compiler to emit references to a given symbol as if it was some other symbol (should be really straightforward). Like I said, I think this isn't even necessary, if the compiler can just generate the same value for the duplicated symbols. OK, I'm not much into how *exactly* the linker works these days. I know the basics though. 4. Ironically the exact same thing works with any kind of immutable data structures. It looks like string pooling is superseded by this proposal. [...] Not really... string pooling can take advantage of overlapping (sub)strings, but I don't think you can do that with code. But I think your idea has a lot of merit. I'm for making linkers smarter than they currently are. Sorry, it's just me running ahead of the train somewhat. Basically once this initial version is in place I have one cool refinement for it in mind. For now we just need to keep the hash function transitive and associative, for gods' sake. 128/64bit checksum please ;) -- Dmitry Olshansky
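The lookup described in step 3 (match on size, then on checksum, then a full byte comparison, and alias on a hit) can be sketched as follows. This is only an illustration under stated assumptions: the names Key, Entry, DuplicateTable and add are hypothetical, FNV-1a stands in for whatever checksum a compiler would actually use, and the byte arrays are placeholders rather than real compiler output. Relocations and symbol visibility are ignored.

```d
// Hypothetical key: candidates must agree on size and checksum first.
struct Key
{
    size_t size;
    ulong sum;
}

// Hypothetical record of one emitted function body.
struct Entry
{
    string symbol;
    immutable(ubyte)[] code;
}

// FNV-1a, used here only as a stand-in checksum.
ulong checksum(const(ubyte)[] code)
{
    ulong h = 14695981039346656037UL;
    foreach (b; code)
    {
        h ^= b;
        h *= 1099511628211UL;
    }
    return h;
}

struct DuplicateTable
{
    Entry[][Key] buckets;   // entries sharing size and checksum
    string[string] aliases; // duplicate symbol -> canonical symbol

    // Returns the symbol that references to `symbol` should resolve to.
    string add(string symbol, immutable(ubyte)[] code)
    {
        auto key = Key(code.length, checksum(code));
        if (auto candidates = key in buckets)
        {
            foreach (ref e; *candidates)
            {
                // Only now do the expensive byte-for-byte comparison.
                if (e.code == code)
                {
                    aliases[symbol] = e.symbol;
                    return e.symbol;
                }
            }
            *candidates ~= Entry(symbol, code); // checksum collision
        }
        else
        {
            buckets[key] = [Entry(symbol, code)];
        }
        return symbol;
    }
}

void main()
{
    DuplicateTable table;
    immutable(ubyte)[] bodyA = [0x89, 0xF8, 0xC3]; // placeholder bytes
    immutable(ubyte)[] bodyB = [0x31, 0xC0, 0xC3]; // placeholder bytes

    assert(table.add("f!int", bodyA) == "f!int");  // first one is kept
    assert(table.add("f!uint", bodyA) == "f!int"); // duplicate is aliased
    assert(table.add("g", bodyB) == "g");          // different body is kept
    assert(table.aliases["f!uint"] == "f!int");
}
```

Whether the alias is recorded in a table like this or expressed by giving both symbols the same value, as suggested in the reply, is a separate choice; the sketch only covers the matching order.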
Re: A modest proposal: eliminate template code bloat
On 08.04.2012 16:37, Marco Leise wrote: [snip] Template bloat could be especially important to 'fix' on embedded systems. I think this idea largely formed years ago when I was working with C++ on 8-bit micros. You won't believe the amount of code size one can save by using one separate generic save-it-all-then-load-it-all prolog/epilog for all functions (and esp. ISRs). Let's wait till the bugs and important features are implemented or hack the compiler ourselves. Let's hack the compiler ;) BTW I think it should be possible to apply the idea at the IR level, not on the actual machine code. -- Dmitry Olshansky
Re: std.benchmark ready for review. Manager sought after
Very good, but the minimum isn't the best guess. Personally I (and I suppose there will be a lot of such maniacs) think that this (minimum) time can be significantly smaller than the average time. So a parameter (probably with a default value) should be added, something like an enum of flags telling what we want to know. At least these look usable: minTime, some mean time, maxTime, standardDeviation, graph (yes, good old ASCII art). Yes, a graph is needed. -- Денис В. Шеломовский Denis V. Shelomovskij
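As a rough illustration of the numbers asked for above, the following sketch computes minimum, mean, maximum and (population) standard deviation from a list of timings. The Stats struct and the summarize function are hypothetical names chosen here; nothing in this sketch is part of std.benchmark.

```d
import std.math : sqrt;

// Hypothetical result record; field names follow the list in the post.
struct Stats
{
    double minTime;
    double meanTime;
    double maxTime;
    double standardDeviation;
}

// Summarizes a non-empty set of timing samples (any unit).
Stats summarize(const(double)[] samples)
{
    assert(samples.length > 0, "need at least one sample");

    double lo = samples[0], hi = samples[0], total = 0;
    foreach (s; samples)
    {
        if (s < lo) lo = s;
        if (s > hi) hi = s;
        total += s;
    }
    immutable mean = total / samples.length;

    // Population standard deviation: square root of the mean squared deviation.
    double squares = 0;
    foreach (s; samples)
        squares += (s - mean) * (s - mean);

    return Stats(lo, mean, hi, sqrt(squares / samples.length));
}

void main()
{
    auto st = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]);
    assert(st.minTime == 2.0);
    assert(st.meanTime == 5.0);
    assert(st.maxTime == 9.0);
    assert(st.standardDeviation == 2.0);
}
```

In this sample the minimum (2.0) is less than half the mean (5.0), which is the kind of gap the post is worried about.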
Re: Small Buffer Optimization for string and friends
On 4/8/2012 7:53 AM, Andrei Alexandrescu wrote: Once anyone asks for .ptr a conservative copy will be made. That could get expensive. You cannot just point into the small string part, because that may only exist temporarily on the stack. There are some pathological cases for this.
Re: Small Buffer Optimization for string and friends
On Sun, Apr 08, 2012 at 05:35:50PM +0200, Jacob Carlborg wrote: On 2012-04-08 16:54, Andrei Alexandrescu wrote: On 4/8/12 5:45 AM, Jacob Carlborg wrote: [...] Just don't make the same mistake as with AA. The mistake with AAs was made long ago, but it was forced, as AAs predated templates. Andrei I'm referring to the new template implementation of AAs that got reverted due to everything breaking, if I recall correctly. [...] Huh? When was this? T -- I suspect the best way to deal with procrastination is to put off the procrastination itself until later. I've been meaning to try this, but haven't gotten around to it yet. -- swr
Re: readonly storage class
On 04/08/2012 06:09 PM, Benjamin Thaut wrote: On 08.04.2012 11:49, Simen Kjaeraas wrote: On Sun, 08 Apr 2012 11:39:03 +0200, Benjamin Thaut c...@benjamin-thaut.de wrote: I'll come up with an example. But if what you say is true, why is immutable not implicitly convertible to const? It is. Thanks, then this is a misunderstanding on my side, and this topic is irrelevant. But what about calling const methods on immutable objects? That works.
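A small sketch of the "That works." above: a method marked const can be called on mutable, const and immutable instances. The Point struct and its manhattan method are invented for this example.

```d
struct Point
{
    int x, y;

    // A const method promises not to modify `this`, so it is callable
    // on mutable, const and immutable instances alike.
    int manhattan() const
    {
        return (x < 0 ? -x : x) + (y < 0 ? -y : y);
    }
}

void main()
{
    Point m = Point(1, -2);
    const Point c = m;
    immutable Point p = Point(3, -4);

    assert(m.manhattan() == 3);
    assert(c.manhattan() == 3);
    assert(p.manhattan() == 7);
}
```

A method without the const qualifier would be rejected for c and p, which is why read-only accessors are usually declared const.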
Re: Small Buffer Optimization for string and friends
On Sun, Apr 08, 2012 at 10:00:37AM -0500, Andrei Alexandrescu wrote: On 4/8/12 8:54 AM, H. S. Teoh wrote: Sorry, I've been busy at work and haven't had too much free time to work on it. The current code is available on github: https://github.com/quickfur/New-AA-implementation Thanks for the update! Let me reiterate this is important work for many reasons, so it would be great if you got around to pushing it through. I'll try my best to work on it when I have free time. Do you think someone else in the community could help you? [...] Well, I put the code up on github for a reason. :-) I'll only be too glad to have someone else contribute. The code itself isn't all that complicated; it's basically just a facsimile of aaA.d with many bugs fixed due to the fact that we now have direct access to key/value types. T -- Life is all a great joke, but only the brave ever get the point. -- Kenneth Rexroth
Re: Small Buffer Optimization for string and friends
On 4/8/12 12:05 PM, Walter Bright wrote: On 4/8/2012 7:53 AM, Andrei Alexandrescu wrote: Once anyone asks for .ptr a conservative copy will be made. That could get expensive. You cannot just point into the small string part, because that may only exist temporarily on the stack. There are some pathological cases for this. As I mentioned, the first call to .ptr changes representation, thus making the allocation that the optimization had saved. Things are not worse off than before. Andrei
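The behaviour discussed here (short strings stored inline, with the first .ptr request forcing a heap copy so that the pointer never refers to storage that may live on the stack) might look roughly like the sketch below. It is an assumption-laden illustration, not the proposed design: SmallString, maxSmall and the field names are hypothetical, and a real implementation would overlay the two representations in a union instead of keeping both side by side.

```d
struct SmallString
{
    enum maxSmall = 15; // hypothetical inline capacity

    private char[maxSmall] small;     // inline storage, part of the struct itself
    private ubyte smallLength;
    private immutable(char)[] large;  // heap representation
    private bool isLarge;

    this(string s)
    {
        if (s.length <= maxSmall)
        {
            small[0 .. s.length] = s[]; // copy into the inline buffer
            smallLength = cast(ubyte) s.length;
        }
        else
        {
            large = s;
            isLarge = true;
        }
    }

    size_t length() const
    {
        return isLarge ? large.length : smallLength;
    }

    // The first call switches to the heap representation, so the returned
    // pointer stays valid even if this struct is a temporary on the stack.
    immutable(char)* ptr()
    {
        if (!isLarge)
        {
            large = small[0 .. smallLength].idup; // the allocation SBO had saved
            isLarge = true;
        }
        return large.ptr;
    }
}

void main()
{
    auto s = SmallString("hello");
    assert(!s.isLarge);                         // no allocation so far
    auto p = s.ptr();
    assert(s.isLarge);                          // representation changed once
    assert(p[0 .. s.length()] == "hello");
    assert(s.ptr() is p);                       // later calls do not copy again
}
```

After the first ptr() call the string costs the same as an ordinary heap string, which matches the claim that things are not worse off than before.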
Re: A modest proposal: eliminate template code bloat
On Sun, Apr 08, 2012 at 08:45:19PM +0400, Dmitry Olshansky wrote: On 08.04.2012 18:18, H. S. Teoh wrote: [snip] We'd have to make sure the checksum doesn't end up in the final executable though, otherwise the bloat may negate any gains we've made. Easy: the symbol size is in the object file (obviously), but it is surely not present in the executable. If I'm worth anything, legacy formats have plenty of slack space reserved for the future. Now is the future :) I agree! [...] Applicability: It's not only const-immutable bloat, which can be alleviated with inout. There are plenty of places where the exact same code is being generated: e.g. sealed containers of int vs uint, std.find on dchar[] vs uint[]/int[] and so on. In general, coarse-grained parametrization is the root of all evil and it is inevitable since we are just humans after all. I'm not sure I understand the last sentence there, but duplicate code elimination is definitely a big plus. It will also mean that we can use templates more freely without having to worry about template bloat. It's easy - define a template on type T. Code it up. Now how many times did you consider that e.g. you can parametrize on the size of the type instead of the type itself? I'm making a point that it's almost always the case with sealed containers of PODs, for instance. (Now multiply the argument by the total number of parameters.) Yeah, that's what I was thinking of. This would be a very big gain for the new AA implementation, for example. I wouldn't have to worry so much about template bloat if most of the instantiations are going to get merged anyway. :-) [...] Not really... string pooling can take advantage of overlapping (sub)strings, but I don't think you can do that with code. But I think your idea has a lot of merit. I'm for making linkers smarter than they currently are. Sorry, it's just me running ahead of the train somewhat. Basically once this initial version is in place I have one cool refinement for it in mind. For now we just need to keep the hash function transitive and associative, for gods' sake. 128/64bit checksum please ;) [...] And what would the cool refinement be? :-) T -- It's bad luck to be superstitious. -- YHL
Re: Discussion on Go and D
On 4/6/2012 9:07 AM, Andrei Alexandrescu wrote: A few more samples of people's perception of the two languages: http://news.ycombinator.com/item?id=3805302 At least we don't have this issue: http://news.ycombinator.com/item?id=3814020 The D gc allocates smallish chunks as required, not one giant virtual arena.