Re: Silicon Valley D Meetup - March 18, 2021 - "Templates in the D Programming Language" by Ali Çehreli
On Friday, 19 March 2021 at 17:10:27 UTC, Ali Çehreli wrote: Jon mentioned how PR 7678 reduced the performance of std.regex.matchOnce. After analyzing the code we realized that the performance loss must be due to two delegate context allocations: https://github.com/dlang/phobos/pull/7678/files#diff-269abc020de3a951eaaa5b8eca5a0700ba8b298767c7a64f459e74e1531a80aeR825

One delegate is 'matchOnceImp' and the other is the anonymous delegate created in the return expression. We understood that 'matchOnceImp' could not be a nested function because of an otherwise useful rule: the name of a nested function alone would *call* that function instead of being a symbol for it. That is not the case for a local delegate variable, which is why 'matchOnceImp' exists as a delegate variable there. Then there is the addition of the 'pure' attribute to it. Fine...

After tinkering with the code, we realized that the same effect can be achieved with a static member function of a static struct, which would not allocate any delegate context. I added @nogc to the following code to prove that point. The following code is even simpler than what Jon and I came up with yesterday.

[... Code snippet removed ...]

There: we injected @trusted code inside a @nogc @safe function. Question to others: Did we fully understand the reason for the convoluted code in that PR? Is the above method really a better solution?

I submitted PR 7902 (https://github.com/dlang/phobos/pull/7902) to address this. I wasn't able to use the version Ali showed in the post, but the PR does use what is essentially the same idea identified at the D Meetup. The issue it fixes is a performance regression, and the fix is a bit more nuanced than would be ideal. Comments and review would be appreciated. --Jon
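To illustrate the pattern being discussed (a minimal sketch only, not the actual std.regex code; the names here are invented):

```d
@safe @nogc unittest
{
    // The bare name of a nested function would *call* it (optional
    // parens), so it can't serve as a symbol for the function. Instead,
    // wrap the helper as a static member function of a static struct:
    // taking its address yields a plain function pointer, so no
    // GC-allocated delegate context is needed.
    static struct Impl
    {
        static int twice(int x) @safe @nogc pure nothrow { return 2 * x; }
    }
    auto fp = &Impl.twice; // a function pointer, not a delegate
    assert(fp(21) == 42);
}
```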
Re: Article: Why I use the D programming language for scripting
On Sunday, 31 January 2021 at 20:36:43 UTC, aberba wrote: It's finally out! https://opensource.com/article/21/1/d-scripting Very nice! Clearly I'm not taking enough advantage of scripting capabilities! --Jon
Re: Github Actions now support D out of the box!
On Friday, 21 August 2020 at 02:03:40 UTC, Mathias LANG wrote: Hi everyone, Almost a year ago, Ernesto Castelloti (@ErnyTech) submitted a PR for Github's "starter-workflow" to add support for D out of the box (https://github.com/actions/starter-workflows/pull/74). It was in a grey area for a while, as Github was trying to come up with a policy for external actions. I ended up picking up the project, after working with actions extensively for my own projects and the dlang org, and my PR was finally merged yesterday (https://github.com/actions/starter-workflows/pull/546). A thank you to everyone who helped put this together. I just started using it, and it works quite well. It's a very valuable tool to have! --Jon
Re: Github Actions now support D out of the box!
On Friday, 21 August 2020 at 02:03:40 UTC, Mathias LANG wrote: [...] Thanks for the effort on this, I'll definitely be checking it out! --Jon
Re: tsv-utils 2.0 release: Named field support
On Tuesday, 28 July 2020 at 15:57:57 UTC, bachmeier wrote: Thanks for your work. I've recommended tsv-utils to some students for their data analysis. It's a nice substitute for a database depending on what you're doing. It really helps that you can store your "database" in a repo like any other text file. I'm going to be checking out the new version soon. Thanks for the support and for checking out the tools! Much appreciated.
Re: tsv-utils 2.0 release: Named field support
On Monday, 27 July 2020 at 14:32:27 UTC, aberba wrote: On Sunday, 26 July 2020 at 20:28:56 UTC, Jon Degenhardt wrote: I'm happy to announce a new major release of eBay's TSV Utilities. The 2.0 release supports named field selection in all of the tools, a significant usability enhancement. So I didn't check it out until today, and I'm really impressed by the documentation, presentation, and just about everything. Thanks for the kind words, and for taking the time to check out the toolkit. Both are very much appreciated!
tsv-utils 2.0 release: Named field support
Hi all, I'm happy to announce a new major release of eBay's TSV Utilities. The 2.0 release supports named field selection in all of the tools, a significant usability enhancement.

For those not familiar, tsv-utils is a set of command line tools for manipulating tabular data files of the type commonly found in machine learning and data mining environments. Filtering, statistics, sampling, joins, etc. The tools are patterned after traditional Unix command line tools like 'cut', 'grep', 'sort', etc., and are intended to work with those tools. Each tool is a standalone executable. Most people will only care about a subset of the tools. It is not necessary to learn the entire toolkit to get value from the tools. The tools are all written in D and are the fastest tools of their type available (benchmarks are on the GitHub repository).

Previous versions of the tools referenced fields by field number, the same as traditional Unix tools like 'cut'. In version 2.0, tsv-utils tools take fields either by field number or by field name, for files with header lines. A few examples using 'tsv-select', a tool similar to 'cut' that also supports field reordering and dropping fields:

$ # Field numbers: Output fields 2 and 1, in that order.
$ tsv-select -f 2,1 data.tsv

$ # Field names: Output the 'Name' and 'RecordNum' fields.
$ tsv-select -H -f Name,RecordNum data.tsv

$ # Drop the 'Color' field, keep everything else.
$ tsv-select -H --exclude Color file.tsv

$ # Drop all the fields ending in '_time'.
$ tsv-select -H -e '*_time' data.tsv

More information is available on the tsv-utils GitHub repository, including documentation and pre-built binaries: https://github.com/eBay/tsv-utils --Jon
Re: On the D Blog: Lomuto's Comeback
On Thursday, 14 May 2020 at 13:26:23 UTC, Mike Parker wrote: After reading a paper that grabbed his curiosity and wouldn't let go, Andrei set out to determine if Lomuto partitioning should still be considered inferior to Hoare for quicksort on modern hardware. This blog post details his results. Blog: https://dlang.org/blog/2020/05/14/lomutos-comeback/ Reddit: https://www.reddit.com/r/programming/comments/gjm6yp/lomutos_comeback_quicksort_partitioning/ HN: https://news.ycombinator.com/item?id=23179160 Got posted again to Hacker News earlier today. Currently at position 5.
Re: Our HOPL IV submission has been accepted!
On Saturday, 29 February 2020 at 01:00:40 UTC, Andrei Alexandrescu wrote: Walter, Mike, and I are happy to announce that our paper submission "Origins of the D Programming Language" has been accepted at the HOPL IV (History of Programming Languages) conference. https://hopl4.sigplan.org/track/hopl-4-papers Getting a HOPL paper in is quite difficult, and an important milestone for the D language. We'd like to thank the D community which was instrumental in putting the D language on the map. The HOPL IV conference will take place in London right before DConf. With regard to travel, right now Covid-19 fears are on everybody's mind; however, we are hopeful that between now and then the situation will improve. Congrats! Indeed a meaningful accomplishment.
New graphs for tsv-utils performance benchmarks
A small thing - Many people who have seen the performance benchmarks for eBay's TSV Utilities find the text table format I've used in the past hard to read. Me too. So, I finally generated more traditional graphical representations for the 2018 benchmark results. The graphs are here: https://github.com/eBay/tsv-utils/blob/master/docs/Performance.md#2018-benchmark-summary

There are no new benchmarks, just new visualizations of the results. For folks who are not familiar with these benchmarks - this is part of a performance study comparing eBay's TSV Utilities with a number of command line tools providing similar functionality (e.g. awk). The results shown were presented at DConf 2018.

* Details of the performance study - https://github.com/eBay/tsv-utils/blob/master/docs/Performance.md
* DConf 2018 talk slides - https://github.com/eBay/tsv-utils/blob/master/docs/dconf2018.pdf
Re: LDC 1.17.0-beta1
On Saturday, 10 August 2019 at 15:51:28 UTC, kinke wrote: Glad to announce the first beta for LDC 1.17: ... Please help test, and thanks to all contributors! No changes in my standard performance tests (good). All functional tests pass as well.
Re: bool (was DConf 2019 AGM Livestream)
On Sunday, 12 May 2019 at 17:08:49 UTC, Jonathan M Davis wrote: ... snip ... Fortunately, in the grand scheme of things, while this issue does matter, it's still much smaller than almost all of the issues that we have to worry about and consider having DIPs for. Personally, I'm not at all happy that this DIP was rejected, but I think that continued debate on it is a waste of everyone's time. Agreed. I too have never liked numeric values equated to true/false, in any programming language. However, it is very common. And, relative to the other big-ticket items on the table, of minor importance. Changing the current behavior won't materially affect the usability of D or its future. This is a case where the best course is to make a decision and move on. --Jon
Re: eBay's TSV Utilities status update
On Friday, 3 May 2019 at 03:54:14 UTC, James Blachly wrote: On 4/29/19 11:23 AM, Jon Degenhardt wrote: An update on changes to this tool-set over the last year. ... Thank you for this, and thanks for your blog post of a couple of years ago, which I referred to many times while learning D and writing fast(er) CLI tools. Looking forward to trying Steve's iopipe as well as your bufferedByLineReader. James Thanks for the kind words James!
eBay's TSV Utilities status update
An update on changes to this tool-set over the last year. For those not familiar, tsv-utils is a set of command line tools for manipulating large tabular data files - files of numeric and text data common in machine learning and data mining environments. Filtering, statistics, sampling, joins, and more. The tools are intended for large files, larger than ideal for loading in memory in tools like R or Pandas, but not so big as to necessitate moving to distributed compute environments. The tools are quite fast, the fastest of their kind available.

Besides being real tools, tsv-utils have also provided an environment for exploring the D programming language and the D ecosystem. In the past year there have been two main areas of work.

One area is the sampling and shuffling facilities provided by the tsv-sample program. New sampling methods are available and performance has been improved. tsv-sample is very similar to the excellent GNU shuf tool, but supports sampling methods not available in shuf. Sampling is a rich and diverse area, and the tsv-sample code is perhaps the most algorithmically interesting in the tool-set.

The other main update is improved I/O read performance in many of the tools. This is from developing a buffered version of byLine. It is especially effective for skinny files (short lines). Most of the tools saw performance gains of 10-40%. One of the earlier performance improvements came from buffering output lines. Combined, the line-by-line read-write performance is quite a bit faster than what is available in Phobos. The iopipe / std.io packages (Steven Schveighoffer, Martin Nowak) are faster still; these are the place to go for really high performance. (See the links below for a benchmark report.)
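The buffered-read idea can be sketched roughly as follows (illustrative only; the actual bufferedByLine in tsv-utils differs in its details - see the code docs in the links below):

```d
import std.stdio : File;

// Sketch: read the file in large chunks and hand out complete lines,
// rather than issuing many small reads as plain byLine effectively does.
// The win is largest for skinny files (short lines).
void eachLineBuffered(string path, scope void delegate(const(char)[]) sink)
{
    auto f = File(path, "r");
    auto buf = new ubyte[](1024 * 1024);
    char[] partial;   // holds a line fragment spanning chunk boundaries

    foreach (chunk; f.byChunk(buf))
    {
        auto text = cast(const(char)[]) chunk;
        size_t start = 0;
        foreach (i, c; text)
        {
            if (c == '\n')
            {
                if (partial.length)
                {
                    partial ~= text[start .. i];
                    sink(partial);
                    partial.length = 0;
                }
                else
                    sink(text[start .. i]);
                start = i + 1;
            }
        }
        partial ~= text[start .. $];   // save any trailing fragment
    }
    if (partial.length) sink(partial); // file may lack a final newline
}
```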
Links: * tsv-utils repo: https://github.com/eBay/tsv-utils * tsv-sample user docs: https://github.com/eBay/tsv-utils/blob/master/docs/ToolReference.md#tsv-sample-reference * tsv-sample code docs: https://tsv-utils.dpldocs.info/tsv_utils.tsv_sample.html * Performance benchmarks on line-oriented I/O facilities: https://github.com/jondegenhardt/dcat-perf/issues/1
Re: NEW Milestone: 1500 packages at code.dlang.org
On Thursday, 7 February 2019 at 18:02:21 UTC, H. S. Teoh wrote: On Thu, Feb 07, 2019 at 05:06:09PM +, Seb via Digitalmars-d-announce wrote: On Thursday, 7 February 2019 at 16:40:08 UTC, Anonymouse wrote: > > What was the word on the autotester (or similar) testing > popular > packages as part of the test suite? This has been done for more than a year now for the ~50 most popular packages: https://buildkite.com/dlang In my opinion this is one of the main reasons why the last releases were so successful (=almost no regressions). That's awesome. This is the way to go. Congrats to everyone who helped pull this off. T Agreed! This is a really nice bit of work that's come out of the D ecosystem.
Re: D-lighted, I'm Sure
On Friday, 18 January 2019 at 14:29:14 UTC, Mike Parker wrote: Not long ago, in my retrospective on the D Blog in 2018, I invited folks to write about their first impressions of D. Ron Tarrant, who you may have seen in the Learn forum, answered the call. The result is the latest post on the blog, the first guest post of 2019. Thanks, Ron! As a reminder, I'm still looking for new-user impressions and guest posts on any D-related topic. Please contact me if you're interested. And don't forget, there's a bounty for guest posts, so you can make a bit of extra cash in the process. The blog: https://dlang.org/blog/2019/01/18/d-lighted-im-sure/ Reddit: https://www.reddit.com/r/programming/comments/ahawhz/dlighted_im_sure_the_first_two_months_with_d/ Nicely done. Very enjoyable, thanks for publishing this! --Jon
Re: My Meeting C++ Keynote video is now available
On Saturday, 12 January 2019 at 15:51:03 UTC, Andrei Alexandrescu wrote: https://youtube.com/watch?v=tcyb1lpEHm0 If nothing else please watch the opening story, it's true and quite funny :o). Now as to the talk, as you could imagine, it touches on another language as well... Andrei Very nice. I especially liked how design by introspection was contrasted with other approaches and how the constexpr discussion fit into the overall theme. --Jon
Re: DCD, D-Scanner and DFMT : new year edition
On Monday, 31 December 2018 at 07:56:00 UTC, Basile B. wrote: DCD [1] 0.10.2 comes with bugfixes and small API changes. DFMT [2] and D-Scanner [3] with bugfixes too and all of the three products are based on d-parse 0.10.z, making life easier and the libraries versions more consistent for the D IDE and D IDE plugins developers. [1] https://github.com/dlang-community/DCD/releases/tag/v0.10.2 [2] https://github.com/dlang-community/dfmt/releases/tag/v0.9.0 [3] https://github.com/dlang-community/D-Scanner/releases/tag/v0.6.0 Thanks for the ongoing work on DCD et al!
Re: Iain Buclaw at GNU Tools Cauldron 2018
On Monday, 8 October 2018 at 05:12:03 UTC, Joakim wrote: On Sunday, 7 October 2018 at 15:41:43 UTC, greentea wrote: Date: September 7 to 9, 2018. Location: Manchester, UK GDC - D front-end GCC https://www.youtube.com/watch?v=iXRJJ_lrSxE Thanks for the link, just watched the whole video. The first half-hour sets the standard as an intro to the language, as only a compiler developer other than the main implementer could give, ie someone with fresh eyes. I loved that Iain started off with a list of real-world projects. That's a mistake a lot of tech talks make, ie not motivating _why_ anybody should care about their tech and simply diving into the tech itself. I hadn't heard some of that info either, great way to begin. I agree, a very nice talk, including the way the motivation part was handled. I especially liked the example of the group who typically used Python for rapid prototyping, then rewrote in C++ for production, who upon trying D for a prototype, were pleasantly surprised it was performant enough for production.
eBay's TSV Utilities repository renamed
I've renamed the TSV Utilities Github repository from eBay/tsv-utils-dlang to eBay/tsv-utils. This is to better reflect the functional nature of the tools. Links pointing to the old github repo will be redirected to the new repo. This includes git operations like clone, etc., so Project Tester should not be affected. Let me know if any issues surface. --Jon
Re: Driving Continuous Improvement in D
On Saturday, 2 June 2018 at 07:23:42 UTC, Mike Parker wrote: In this post for the D Blog, Jack Stouffer details how dscanner is used in the Phobos development process to help improve code quality and fight entropy. The blog: https://dlang.org/blog/2018/06/02/driving-continuous-improvement-in-d/ reddit: https://www.reddit.com/r/programming/comments/8nyzmk/driving_continuous_improvement_in_d/ Nice post. I haven't tried dscanner on my code, but I plan to now. It looks like the documentation on the dscanner repo is pretty good. If you think it's ready for wider adoption, consider adding a couple lines to the blog post indicating that folks who want to try it will find instructions in the repo.
Re: iopipe v0.0.4 - RingBuffers!
On Friday, 11 May 2018 at 15:44:04 UTC, Steven Schveighoffer wrote: On 5/10/18 7:22 PM, Steven Schveighoffer wrote: Shameful note: Macos grep is BSD grep, and is not NEARLY as fast as GNU grep, which has much better performance (and is 2x as fast as iopipe_search on my Linux VM, even when printing line numbers).

Yeah, the MacOS default versions of the Unix text processing tools are really slow. It's worth installing the GNU versions if doing performance comparisons on MacOS, or if you work with large files. Homebrew and MacPorts both have the GNU versions. Some relevant packages: coreutils, grep, gsed (sed), gawk (awk). Most tools are in coreutils. Many will be installed with a 'g' prefix by default, leaving the existing tools in place. E.g., 'cut' will be installed as 'gcut' unless specified otherwise. --Jon
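For reference, a typical install line looks like this (package names are the Homebrew ones as I recall them; MacPorts names differ slightly, e.g. its sed package is 'gsed'):

```shell
# GNU versions install alongside the BSD tools with a 'g' prefix:
# gcut, gsort, ggrep, gsed, gawk, ...
brew install coreutils grep gnu-sed gawk
```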
Re: Things to do in Munich
On Monday, 30 April 2018 at 19:57:10 UTC, Seb wrote: As I live in Munich and there have been a few threads about things to do in Munich, I thought I'd quickly share a few selected activities + current events. - over 80 museums (best ones: Museum Brandhorst, Pinakothek der Moderne, Haus der Kunst, Deutsches Museum, Glyptothek, potato museum, NS- Most of the museums are closed today (public holiday). Check before you go. However, the surfers are out! —Jon
Re: Project Highlight: The D Community Hub
On Saturday, 17 February 2018 at 12:56:34 UTC, Mike Parker wrote: In case you aren't aware of the dlang-community organization at GitHub, it's an umbrella group of contributors working to keep certain D projects alive and updated. Sebastian Wilzbach filled me in on some details for the latest Project Highlight on the blog. blog: https://dlang.org/blog/2018/02/17/project-highlight-the-d-community-hub/ reddit: https://www.reddit.com/r/programming/comments/7y6gw1/the_d_community_hub_an_umbrella_group_for_d/ Very nice article. There are more projects there than I had realized!
Re: TSV Utilities release with LTO and PGO enabled
On Wednesday, 17 January 2018 at 21:49:52 UTC, Johan Engelen wrote: On Wednesday, 17 January 2018 at 04:37:04 UTC, Jon Degenhardt wrote: Clearly personal judgment played a role. However, the tools are reasonably task focused, and I did take basic steps to ensure the benchmark data and tests were separate from the training data/tests. For these reasons, my confidence is good that the results are reasonable and well founded. Great, thanks for the details, I agree. Hope it's useful for others to see these details. Thanks Johan, much appreciated. :) (btw, did you also check the performance gains when using the profile of the benchmark itself, to learn about the upper-bound of PGO for your program?) I'll merge the IR PGO addition into LDC master soon. Don't know what difference it'll make. No, I didn't do an upper-bounds check, that's a good idea. I plan to test the IR based PGO when it's available, I'll run an upper-bounds check as part of it.
Re: TSV Utilities release with LTO and PGO enabled
On Tuesday, 16 January 2018 at 22:04:52 UTC, Johan Engelen wrote: Because PGO optimizes for the given profile, it would help a lot if you clarified how you do your PGO benchmarking. What kind of test load profile you used for optimization and what test load you use for the time measurement.

The profiling used is checked into the repo and run as part of a PGO build, so it is available for inspection. The benchmarks used for deltas are also documented; they are the ones used in the benchmark comparison to similar tools done in March 2017. This report is in the repo (https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md). However, it's hard to imagine anyone perusing the repo for this stuff, so I'll try to summarize what I did below.

Benchmarks - Six different tests of rather different but common operations run on large data files. The six tests were chosen because for each I was able to find at least three other tools, written in native compiled languages, with similar functionality. There are other valuable benchmarks, but I haven't published them.

Profiling - Profiling was developed separately for each tool. For each I generated several data files with data representative of typical use cases. Generally numeric or text data in several forms and distributions. The data was unrelated to the data used in the benchmarks, which is from publicly available machine learning data sets. However, personal judgement was used in the generation of the data sets, so it's not free from bias. After generating the data, I generated a set of run options specific to each tool. As an example, tsv-filter selects data file lines based on various numeric and text criteria (e.g. less-than). There are a bit over 50 comparison operations, plus a few meta operations. The profiling runs ensure all the operations are run at least once, with the most important ones overweighted.
The ldc.profile.resetAll call was used to exclude all the initial setup code (command line argument processing). This was nice because it meant the data files could be small relative to real-world sets, and it runs fast enough to do as part of the build step (i.e. on Travis-CI). Look at https://github.com/eBay/tsv-utils-dlang/tree/master/tsv-filter/profile_data to see a concrete example (tsv-filter). In that directory are five data files and a shell script that runs the commands and collects the data.

This was done for four of the tools, covering five of the benchmarks. I skipped one of the tools (tsv-join), as it's harder to come up with a concise set of profile operations for it. I then ran the standard benchmarks I usually report on in various D venues.

Clearly personal judgment played a role. However, the tools are reasonably task focused, and I did take basic steps to ensure the benchmark data and tests were separate from the training data/tests. For these reasons, my confidence is good that the results are reasonable and well founded. --Jon
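In outline, an LDC instrumentation-based PGO build looks like the following (a sketch only; the file and binary names are illustrative, and the repo's build scripts wire these steps together):

```shell
# 1. Build an instrumented binary.
ldc2 -O -release -fprofile-instr-generate=profile.raw -of=app-instr app.d

# 2. Run it on representative data to collect the profile.
./app-instr profile_data/typical-input.tsv > /dev/null

# 3. Merge the raw profile and rebuild using it.
ldc-profdata merge -output=profile.data profile.raw
ldc2 -O -release -fprofile-instr-use=profile.data -of=app app.d
```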
Re: TSV Utilities release with LTO and PGO enabled
On Tuesday, 16 January 2018 at 00:19:24 UTC, Martin Nowak wrote: On Sunday, 14 January 2018 at 23:18:42 UTC, Jon Degenhardt wrote: Combined, LTO and PGO resulted in performance improvements greater than 25% on three of my standard six benchmarks, and five of the six improved at least 8%. Yay, I'm usually seeing double digit improvements for PGO alone, and single digit improvements for LTO. Meaning PGO has more effect even though LTO seems to be the more hyped one. Have you bothered benchmarking them separately? Last spring I made a few quick tests of both separately. That was just against the app code, without druntime/phobos. Saw some benefit from LTO, mainly one of the tools, and not much from PGO. More recently I tried LTO standalone and LTO plus PGO, both against app code and druntime/phobos, but not PGO standalone. The LTO benchmarks are here: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/dlang-meetup-14dec2017.pdf. I haven't published the LTO + PGO benchmarks. The takeaway from my tests is that LTO and PGO will benefit different apps differently, perhaps in ways not easily predicted. One of my tools benefited primarily from PGO, two primarily from LTO, and one materially from both. So, it is worth trying both. For both, the big win was from optimizing across app code and libs (druntime/phobos in my case). It'd be interesting to see if other apps see similar behavior, either with phobos/druntime or other libraries, perhaps libraries from dub dependencies.
TSV Utilities release with LTO and PGO enabled
I just released a new version of eBay's TSV Utilities. The cool thing about the release is not the changes in the toolkit, but that it was possible to build everything using LDC's support for Link Time Optimization (LTO) and Profile Guided Optimization (PGO). This includes running the optimizations on both the application code and the D standard libraries (druntime and phobos). Further, it was all doable on Travis-CI (Linux and MacOS), including building release binaries available from the GitHub release page. Combined, LTO and PGO resulted in performance improvements greater than 25% on three of my standard six benchmarks, and five of the six improved at least 8%. Release info: https://github.com/eBay/tsv-utils-dlang/releases/tag/v1.1.16
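For those wanting to try it, the LTO-across-the-runtime setup looks roughly like this (a sketch based on the LDC 1.5-era tooling; exact flags and output paths may differ by release):

```shell
# Build druntime and phobos with thin LTO enabled
# (outputs under ./ldc-build-runtime.tmp by default).
ldc-build-runtime --dFlags="-flto=thin"

# Build the application with LTO, linking against the
# LTO-enabled runtime libraries just built.
ldc2 -O -release -flto=thin \
    -L-L"$PWD/ldc-build-runtime.tmp/lib" app.d
```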
Re: DLang docker images for CircleCi 2.0
On Wednesday, 3 January 2018 at 13:12:48 UTC, Seb wrote: tl;dr: you can now use special D docker images for CircleCi 2.0 [snip] PS: I'm aware of Stefan Rohe's great D Docker images [1], but this Docker image is built on top of the specialized CircleCi image (e.g. for their SSH login). One useful characteristic of Stefan's images is that the Dockerhub pages include the Dockerfile and github repository links. I don't know what it takes to include them. It does make it easier to see exactly what the configuration is, find the repo, and even create PRs against them. It would be useful if they could be added to the CircleCI image pages. My interest in this case - I use Stefan's LDC image in Travis-CI builds. Building the runtime libraries with LTO/PGO requires the ldc-build-runtime tool, which in turn requires a few additional things in the docker image, like cmake or ninja. I was interested in whether they might have been included in the CircleCI images as well. (It doesn't appear so.)
Re: Article: Finding memory bugs in D code with AddressSanitizer
On Monday, 25 December 2017 at 17:03:37 UTC, Johan Engelen wrote: I've been writing this article since August, and finally found some time to finish it: http://johanengelen.github.io/ldc/2017/12/25/LDC-and-AddressSanitizer.html "LDC comes with improved support for Address Sanitizer since the 1.4.0 release. Address Sanitizer (ASan) is a runtime memory write/read checker that helps discover and locate memory access bugs. ASan is part of the official LDC release binaries; to use it you must build with -fsanitize=address. In this article, I’ll explain how to use ASan, what kind of bugs it can find, and what bugs it will be able to find in the (hopefully near) future." Nice article. Main question / comment is about the need for blacklisting the D standard libraries (druntime/phobos). If someone wants to try ASan out on their own code, can they start by ignoring the D standard libraries? And, for programs that use druntime/phobos, will this be effective? If I understand the post, the answer is "yes", but I think it could be more explicit. Second comment is related - If the reader were to try instrumenting druntime/phobos along with their own code, how much effort should be expected to correctly blacklist druntime/phobos code? Would many programs have smooth sailing if they took the blacklist published in the post? Or is this early-stage enough that some real effort should be expected? Also, if the blacklist file in the post represents a meaningful starting point, perhaps it makes sense to check it in and distribute it. This would provide a place for contributors to start making improvements.
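For readers wanting to try it on their own code, the basic invocation is simple (a sketch; the file names are illustrative, and the article covers the blacklist details):

```shell
# Compile with ASan instrumentation; running the binary reports
# bad memory reads/writes with a stack trace.
ldc2 -g -fsanitize=address myprog.d
./myprog

# Optionally suppress reports originating in uninstrumented code
# (e.g. druntime/phobos) via a blacklist file:
ldc2 -g -fsanitize=address -fsanitize-blacklist=asan-blacklist.txt myprog.d
```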
Re: Silicon Valley D Meetup - December 14, 2017 - "Experimenting with Link Time Optimization" by Jon Degenhardt
On Saturday, 16 December 2017 at 11:52:37 UTC, Johan Engelen wrote: Clearly very interested in what your PGO testing will show. :-) Early returns on adding PGO on top of LTO (first five benchmarks in the slide deck, tsv-join not tested): * Two meaningful improvements: - csv2tsv: Linux: 8%; macOS: 33% - tsv-summarize: Linux: 6%; macOS: 11% * Minor improvements on the other three benchmarks (< 5%) Overall, for LDC 1.5, the improvements going from a normal optimized build to one combining LTO and PGO ranged from 8-45% on Linux and 6-57% on macOS. (First five benchmarks, excluding tsv-join). Impressive! --Jon
Re: Silicon Valley D Meetup - December 14, 2017 - "Experimenting with Link Time Optimization" by Jon Degenhardt
On Saturday, 16 December 2017 at 11:52:37 UTC, Johan Engelen wrote: On Friday, 15 December 2017 at 03:08:35 UTC, Ali Çehreli wrote: This should be live now: http://youtu.be/e05QvoKy_8k Great! I've added some comments there, pasted here: Fantastic feedback! Fills in some really important details. Can't wait to see the results of LTO on Weka.io's (LARGE) applications. Work in progress...! Agreed. It'd be great to see the experience of a few more apps. Could you add the reference links in the comment section there too? (can't click on blue links in the video ;-) Done. Thanks for pointing this out. I also updated the posted slide deck so that the hyperlinks work after downloading it. (They still aren't clickable in the GitHub inline viewer.) Clearly very interested in what your PGO testing will show. :-) Yes, should be interesting. Promising results in one benchmark. And sigh, I forgot to mention the opportunity you mentioned for someone to participate: Adding LLVM's IR-level PGO to the LDC compiler. Sounds pretty cool.
Re: Silicon Valley D Meetup - December 14, 2017 - "Experimenting with Link Time Optimization" by Jon Degenhardt
On Friday, 15 December 2017 at 03:08:35 UTC, Ali Çehreli wrote: This should be live now: http://youtu.be/e05QvoKy_8k Ali On 11/21/2017 11:58 AM, Ali Çehreli wrote: Meetup page: https://www.meetup.com/D-Lang-Silicon-Valley/events/245288287/ LDC[1], the LLVM-based D compiler, has been adding Link Time Optimization capabilities over the last several releases. [...] This talk will look at the results of applying LTO to one set of applications, eBay's TSV utilities[2]. [...] Jon Degenhardt is a member of eBay's Search Science team. [...] D quickly became his favorite programming language, one he uses whenever he can. Ali [1] https://github.com/ldc-developers/ldc#ldc--the-llvm-based-d-compiler [2] https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/ Slides from the talk: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/dlang-meetup-14dec2017.pdf
Re: LDC 1.5.0
On Friday, 3 November 2017 at 17:17:04 UTC, kinke wrote: Hi everyone, on behalf of the LDC team, I'm glad to finally officially announce LDC 1.5. The highlights of this version in a nutshell: * Based on D 2.075.1. * Polished LLVM 5.0 support (now also used for the prebuilt release packages). * Prebuilt ARM-Linux package available again. * New command-line option `-linker` and ~25 new advanced ones for codegen fine-tuning. * Bugfixes, as always. Full release log and downloads: https://github.com/ldc-developers/ldc/releases/tag/v1.5.0 Thanks to all contributors! [LDC master is at v2.076.1, so LDC 1.6 won't take long.] Great work by the LDC team! Thanks to all the LTO work in 1.4 and 1.5, the Travis-CI builds of the eBay TSV utilities are LTO enabled for Phobos & Druntime as well as the application code. This is for both Linux and OS X builds. Couldn't do that before the LDC 1.5 release. The OS X executables are materially faster with the end-to-end LTO support. I haven't benchmarked the Linux versions yet. It would be very interesting to get benchmark numbers from other apps, especially those making material use of phobos.
Re: LDC 1.4.0-beta1
On Saturday, 26 August 2017 at 22:35:11 UTC, kinke wrote: Hi everyone, on behalf of the LDC team, I'm glad to announce LDC 1.4.0-beta1. The highlights of version 1.4 in a nutshell: * Based on D 2.074.1. * Shipping with ldc-build-runtime, a small D tool to easily (cross-)compile the runtime libraries yourself. * Full Android support, incl. emulated TLS. * Improved support for AddressSanitizer and libFuzzer. The libraries are shipped with the prebuilt Linux x86_64 and OSX packages. * Prebuilt Linux x86_64 package shipping with LTO plugin, catching up with the OSX package. Full release log and downloads: https://github.com/ldc-developers/ldc/releases/tag/v1.4.0-beta1 Thanks to everybody contributing! Wow, this looks fantastic, congrats! --Jon
Re: Compile-Time Sort in D
On Wednesday, 7 June 2017 at 20:59:50 UTC, Joakim wrote: On Tuesday, 6 June 2017 at 01:08:45 UTC, Mike Parker wrote: On Monday, 5 June 2017 at 17:54:05 UTC, Jon Degenhardt wrote: Very nice post! Thanks! If it gets half as many page views as yours did, I'll be happy. Yours is the most-viewed post on the blog -- over 1000 views more than #2 (my GC post), and 5,000 more than #3 (A New Import Idiom). I was surprised it's so popular, as the proggit thread didn't do that great, but it did well on HN and I now see it inspired more posts for Rust (written by bearophile, I think) and Go, in addition to the Nim post linked here before: https://users.rust-lang.org/t/faster-command-line-tools-in-d-rust/10992 https://aadrake.com/posts/2017-05-29-faster-command-line-tools-with-go.html I was surprised as well, pleasantly of course. Using a simple example may have helped. Personally, I'm not bothered by the specific instances of negative feedback on Reddit. It's hard to write a post that manages to avoid that sort of thing entirely. It was also nice to see related follow-up in the D forums ("how to count lines fast" and "std.csv Performance Review"). It's less clear whether the case for how well suited D's facilities are to this type of problem came across. It's much clearer in the Compile-Time Sort post. --Jon
Re: Compile-Time Sort in D
On Monday, 5 June 2017 at 14:23:34 UTC, Mike Parker wrote: The crowd-edited (?) blog post exploring some of D's compile-time features is now live. Thanks again to everyone who helped out with it. The blog: https://dlang.org/blog/2017/06/05/compile-time-sort-in-d/ Reddit: https://www.reddit.com/r/programming/comments/6fefdg/compiletime_sort_in_d/ Very nice post!
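For readers who haven't seen the post yet, the core idea fits in a few lines. This is a minimal sketch of my own, not code taken from the post: initializing an enum forces the expression through CTFE, so the sort happens entirely at compile time.

```d
import std.algorithm : sort;
import std.array : array;

// The sort runs during compilation: an enum initializer must be a
// compile-time constant, so the sorted result is baked into the binary.
enum sortedValues = sort([3, 1, 4, 1, 5, 9, 2, 6]).array;

// Also checked at compile time; no run-time work happens here.
static assert(sortedValues == [1, 1, 2, 3, 4, 5, 6, 9]);

void main() {}
```

The nice part, as the post explains, is that this is the ordinary std.algorithm.sort, not a special compile-time variant.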
Re: Faster Command Line Tools in D
On Thursday, 25 May 2017 at 05:17:29 UTC, Walter Bright wrote: Any time one writes an article comparing speed between languages X and Y, someone gets their ox gored and will bitterly complain about how unfair the article is (though I noticed that none of the complainers wrote a faster Python version). Even if you tried to optimize the Python program, you'd inevitably be accused of deliberately not doing it right. The nadir of this for me was when I compared Digital Mars C++ code with DMD. Both share the same optimizer and back end, yet I was accused of "sabotaging" my own C++ compiler in order to make D look better!! Me, I just don't do public comparison benchmarking anymore. It's a waste of time arguing with people about it. I thought you wrote a fine article, and the criticism about the Python code was unwarranted (especially since nobody suggested better code), because the article was about optimizing D code, not optimizing Python. Thanks Walter, I appreciate your comments. And correct: as multiple people noted, a speed comparison with other languages was not at all a goal of the article. The real intent was to tell a story of how several of D's features play together to enable optimizations like this, without having to write low-level code or step outside the core language features and standard library. --Jon
Re: Faster Command Line Tools in D
On Wednesday, 24 May 2017 at 21:46:10 UTC, cym13 wrote: On Wednesday, 24 May 2017 at 21:34:08 UTC, Walter Bright wrote: It's now #4 on the front page of Hacker News: https://news.ycombinator.com/news The comments on HN are useless though; everybody went for the "D versus Python" thing and seems to complain that it's doing a D/Python benchmark while only talking about D optimization... even though optimizing D is the whole point of the article. In the same way they rant against the fact that many iterations on the D script are shown, while the point is obviously to present different tricks and be clear about what each one gives. I am disappointed because there are so many good things to say about this, so many good questions or remarks to make when not familiar with the language, and yet all we get is "Meh, this benchmark shows nothing of D's speed against Python". It's not easy writing an article that doesn't draw some form of criticism. FWIW, the reason I gave a Python example is that Python is very commonly used for this type of problem and the language is well suited to it. A second reason is that I've seen several posts where someone has tried to rewrite a Python program like this in D, started with `split`, and wondered how to make it faster. My hope is that the article clarifies how to achieve this. Another goal of the article was to describe how the performance of the TSV Utilities had been achieved. The article is not about the TSV Utilities, but discussing both the benchmark results and how they were achieved would have made for a very long article. --Jon
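For those who haven't read the article, the central optimization it walks through is moving from eager `split` to lazy iteration. A rough sketch of my own in that spirit (not the article's exact code; `sumColumn` and its input handling are illustrative only):

```d
import std.algorithm : splitter;
import std.conv : to;

// Sum one column of a stream of TSV lines. Lazy splitter avoids the
// per-line array allocation that eager split incurs, and only the
// needed field is converted from text to a number.
long sumColumn(Lines)(Lines lines, size_t col)
{
    long total = 0;
    foreach (line; lines)
    {
        size_t i = 0;
        foreach (field; line.splitter('\t'))
        {
            if (i++ == col)
            {
                total += field.to!long;
                break;  // skip the rest of the line's fields
            }
        }
    }
    return total;
}

void main()
{
    // In a real tool this would be driven by File("data.tsv").byLine.
    assert(sumColumn(["a\t10", "b\t32"], 1) == 42);
}
```

Because `splitter` and `byLine` are both lazy ranges, the same shape scales from a unittest literal to a multi-gigabyte file without code changes.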
Re: Faster Command Line Tools in D
On Wednesday, 24 May 2017 at 17:36:29 UTC, cym13 wrote: On Wednesday, 24 May 2017 at 13:39:57 UTC, Mike Parker wrote: [...snip...] A bit off topic, but I really like that we still get quality content such as this post on this blog. Sustained quality is a hard job and I thank everyone involved for that. The compliment to the community is well deserved; thank you for including this post in that company. In this case, the post benefited from some really excellent review feedback, and Mike made the publication side really easy. --Jon
Re: [OT] Fast Deterministic Selection
On Thursday, 18 May 2017 at 15:14:17 UTC, Andrei Alexandrescu wrote: The implementation is an improved version of what we now have in the D standard library. I'll take up the task of updating phobos at a later time. https://www.reddit.com/r/programming/comments/6bwsjn/fast_deterministic_selection_sea_2017_now_with/ Andrei Very nice! Is this materially faster than what is currently in Phobos (PR 4815)? That update was a substantial performance win by itself. --Jon
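For context, the Phobos routine this work feeds into is std.algorithm.topN, which does selection: after the call, the element of rank n sits at index n, with smaller elements before it and larger ones after. A minimal illustration of that API (not the paper's algorithm itself):

```d
import std.algorithm : topN;

void main()
{
    auto a = [9, 2, 7, 4, 1, 8, 3];

    // Selection, not a full sort: place the rank-3 (0-based) element
    // at index 3 and partition the rest around it.
    topN(a, 3);

    assert(a[3] == 4);  // fully sorted order would be [1, 2, 3, 4, 7, 8, 9]
}
```

The point of deterministic selection algorithms like the one in the paper is to guarantee linear worst-case time for exactly this operation.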
Re: dmd Backend converted to Boost License
On Friday, 7 April 2017 at 15:14:40 UTC, Walter Bright wrote: https://github.com/dlang/dmd/pull/6680 Yes, this is for real! Symantec has given their permission to relicense it. Thank you, Symantec! Congrats, this is a great result!
Re: Updates to the tsv-utils toolkit
On Wednesday, 22 February 2017 at 18:12:50 UTC, Jon Degenhardt wrote: It's not quite a year since the open-sourcing of eBay's tsv utilities. Since then there have been a number of additions and updates, and the tools form a more complete package. The tools assist with manipulation of the tabular data files common in machine learning and data mining environments. They work alongside traditional Unix command line tools like 'cut' and 'sort'. They also fit well with data mining and stats packages like R and Pandas. The tools include filtering, slicing, joins and other manipulation, sampling, and statistical calculations. If you find yourself working with large data files from a Unix shell, you may like these tools. Speed matters when processing large data files, and these tools are fast. I've published new benchmarks comparing the tools to similar tools written in several natively compiled programming languages. The tools are the fastest on five of the six benchmarks run, generally by significant margins. It's a good result for the D programming language. The benchmarks may be of interest regardless of your interest in the tools themselves. Repository: https://github.com/eBay/tsv-utils-dlang Performance benchmarks: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md --Jon One more update: Schveiguy helped identify the performance bottleneck in the csv2tsv tool; the tools are now the fastest on all six benchmarks. The benchmarks have been updated (and reformatted a bit). Summary table here: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md#top-four-in-each-benchmark
Re: Updates to the tsv-utils toolkit
On Wednesday, 22 February 2017 at 21:07:43 UTC, bpr wrote: On Wednesday, 22 February 2017 at 18:12:50 UTC, Jon Degenhardt wrote: ...snip... Repository: https://github.com/eBay/tsv-utils-dlang Performance benchmarks: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md --Jon This is very nice code, and a good result for D. I'll study this carefully. So much of data analysis is reading/transforming files... ...snip... Thanks! Both for the feedback and for any evaluation you might do. Any insights or thoughts you may have would be quite welcome. --Jon
Re: Updates to the tsv-utils toolkit
On Wednesday, 22 February 2017 at 18:43:57 UTC, Jack Stouffer wrote: On Wednesday, 22 February 2017 at 18:12:50 UTC, Jon Degenhardt wrote: Speed matters when processing large data files, and these tools are fast. I've published new benchmarks comparing the tools to similar tools written in several native compiled programming languages. The tools are the fastest on five of the six benchmarks run, generally by significant margins. It's a good result for the D programming language. Great news! Agreed, an outstanding result. I had not anticipated the deltas. The specialty toolkits have been anonymized in the tables below. The purpose of these benchmarks is to gauge performance of the D tools, not make comparisons between other toolkits. You're no fun ;) Yeah, I know. Not my style.
Updates to the tsv-utils toolkit
It's not quite a year since the open-sourcing of eBay's tsv utilities. Since then there have been a number of additions and updates, and the tools form a more complete package. The tools assist with manipulation of the tabular data files common in machine learning and data mining environments. They work alongside traditional Unix command line tools like 'cut' and 'sort'. They also fit well with data mining and stats packages like R and Pandas. The tools include filtering, slicing, joins and other manipulation, sampling, and statistical calculations. If you find yourself working with large data files from a Unix shell, you may like these tools. Speed matters when processing large data files, and these tools are fast. I've published new benchmarks comparing the tools to similar tools written in several natively compiled programming languages. The tools are the fastest on five of the six benchmarks run, generally by significant margins. It's a good result for the D programming language. The benchmarks may be of interest regardless of your interest in the tools themselves. Repository: https://github.com/eBay/tsv-utils-dlang Performance benchmarks: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/Performance.md --Jon
Re: Silicon Valley D Meetup - January 26, 2017 - "High Performance Tools in D" by Jon Degenhardt
On Saturday, 18 February 2017 at 07:50:02 UTC, Joakim wrote: On Friday, 27 January 2017 at 18:20:53 UTC, Jon Degenhardt wrote: On Friday, 27 January 2017 at 16:21:51 UTC, Jack Stouffer wrote: On Friday, 27 January 2017 at 03:58:26 UTC, Ali Çehreli wrote: And this: http://youtu.be/-DK4r5xewTY Hey Jon, if you're in this thread, are you able to post any of the code that you use for tsv parsing? Code has been open-sourced: https://github.com/eBay/tsv-utils-dlang The performance benchmarks shown in the talk are not in the repo; the benchmarks currently listed are from a year ago. I'm planning to update the repo in the next few weeks, probably after the next LDC release. If there are questions about specific types of things, perhaps a thread in the General forum would work. --Jon Watched the video some time back, interesting results. Any plans to blog about this? It would be great if you could run them through a profiler too and see why D is so much faster. Would be really worth writing this up, maybe on the D blog. Thanks for the feedback. I'm pretty close to publishing the benchmarks; they'll go in a doc file in the repository. They weren't quite complete when the meetup happened. Regarding a blog post - I haven't talked to Mike Parker, but if there's interest I'd be open to it. As to why the tools compare so well - that's a really intriguing question, especially since the tools favor using high-level constructs from D / Phobos rather than hand-built data structures or memory management. I have hypotheses, but no sure answers. Some of it likely involves design choices rather than language facilities per se, but even so, it's a good story for D. --Jon
Re: two points
On Thursday, 9 February 2017 at 16:48:16 UTC, Joseph Rushton Wakeling wrote: There's clearly in part a scaling problem here (in terms of how many people are available in general, and in terms of how many people have expertise on particular parts of the library), but it also feels like a few simple things (like making sure every PR author is given a reliable contact or two who they can feel entitled to chase up) could make a big difference. Regarding the scaling problem - perhaps the bug system could be used to help engage a wider community of reviewers. Specifically, update the bugzilla ticket early in the PR lifecycle as an alerting mechanism. This idea comes from my experiences so far. I've found any number of bugs and enhancements in the bug system that directly interact with things I'm implementing. I typically add myself to the CC list so I hear about changes. In many of these cases I'd be willing to help with reviewing. However, when a PR associated with the issue is created, the ticket itself is normally not updated until after the review is finished and the PR closed, too late to help out. Of course, someone like myself, a part-timer to the community at best, should not be a primary reviewer. However, for specific issues, it's often the case that I've studied the area of code involved. If there is a wider set of people in a similar situation, this approach might help engage them. --Jon
Re: Silicon Valley D Meetup - January 26, 2017 - "High Performance Tools in D" by Jon Degenhardt
On Friday, 27 January 2017 at 20:48:30 UTC, Ali Çehreli wrote: On 01/27/2017 08:21 AM, Jack Stouffer wrote: On Friday, 27 January 2017 at 03:58:26 UTC, Ali Çehreli wrote: And this: http://youtu.be/-DK4r5xewTY Hey Jon, if you're in this thread, are you able to post any of the code that you use for tsv parsing? Yeah, the slide starting at 19'35 is the most interesting: https://www.youtube.com/watch?v=-DK4r5xewTY&feature=youtu.be&t=1175 Tools written in D (mostly with Phobos and with GC) are at least 3 times faster! Let's verify the results and then make some noise. :) Ali An independent verification of the results would be fantastic. Any time a single person does this type of benchmark, especially the author of the tool, there's real risk of an error. In this case I took every reasonable step I knew to be diligent about it, but still. And yes, the deltas are impressive. I was surprised.
Re: Silicon Valley D Meetup - January 26, 2017 - "High Performance Tools in D" by Jon Degenhardt
On Friday, 27 January 2017 at 16:21:51 UTC, Jack Stouffer wrote: On Friday, 27 January 2017 at 03:58:26 UTC, Ali Çehreli wrote: And this: http://youtu.be/-DK4r5xewTY Hey Jon, if you're in this thread, are you able to post any of the code that you use for tsv parsing? Code has been open-sourced: https://github.com/eBay/tsv-utils-dlang The performance benchmarks shown in the talk are not in the repo; the benchmarks currently listed are from a year ago. I'm planning to update the repo in the next few weeks, probably after the next LDC release. If there are questions about specific types of things, perhaps a thread in the General forum would work. --Jon
Command line tool for weighted reservoir sampling
I released a new tool for weighted random sampling of tabular data files: tsv-sample. It's one of several tools recently added to the tsv file toolkit I released last year. These tools are especially useful when data files are larger than is desirable to read entirely into memory in R and similar apps. I'll publish an announcement of a broader set of tool updates in the next few weeks; I have some performance benchmarks to finish first. However, weighted reservoir sampling algorithms are interesting, so I thought there might be enough interest to warrant a separate announcement. Repo: https://github.com/eBay/tsv-utils-dlang tsv-sample code: https://github.com/eBay/tsv-utils-dlang/blob/master/tsv-sample/src/tsv-sample.d --Jon
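For the curious, the classic weighted reservoir scheme is the Efraimidis-Spirakis family: each item gets the random key u^(1/weight), and the k items with the largest keys form the sample. Here is a minimal sketch of that idea, my own illustration rather than tsv-sample's actual code; the `WeightedLine` record and the linear min-scan (a real implementation would use a min-heap) are simplifications:

```d
import std.algorithm : minIndex;
import std.math : pow;
import std.random : Random, uniform01;

struct WeightedLine { double weight; string line; }  // hypothetical input record
struct Entry { double key; string line; }

// Efraimidis-Spirakis style weighted reservoir sampling: one pass,
// O(k) memory, and each item's selection probability is proportional
// to its weight.
Entry[] weightedReservoir(WeightedLine[] input, size_t k, ref Random rng)
{
    Entry[] reservoir;
    foreach (item; input)
    {
        immutable key = pow(uniform01(rng), 1.0 / item.weight);
        if (reservoir.length < k)
            reservoir ~= Entry(key, item.line);
        else
        {
            // Replace the current minimum-key entry if this key is larger.
            auto i = reservoir.minIndex!((a, b) => a.key < b.key);
            if (key > reservoir[i].key)
                reservoir[i] = Entry(key, item.line);
        }
    }
    return reservoir;
}

void main()
{
    auto rng = Random(42);
    auto data = [WeightedLine(1.0, "a"), WeightedLine(5.0, "b"),
                 WeightedLine(2.0, "c"), WeightedLine(0.5, "d")];
    auto sample = weightedReservoir(data, 2, rng);
    assert(sample.length == 2);
}
```

The appeal of the algorithm is that it streams: the sample stays valid at every point in the pass, so file size never needs to be known up front.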
Re: Beta 2.073.0-b1
On Saturday, 7 January 2017 at 05:02:13 UTC, Martin Nowak wrote: First beta for the 2.073.0 release. This release comes with a few Phobos additions, a new -mcpu=avx switch, experimental safety checks (-transition=safe/-dip1000), and several bugfixes. http://dlang.org/download.html#dmd_beta http://dlang.org/changelog/2.073.0.html Please report any bugs at https://issues.dlang.org -Martin The change log should probably include the topN rewrite (PR 4815 and several issue reports). --Jon
Re: The D Language Foundation is now a tax exempt non-profit organization
On Monday, 29 August 2016 at 17:03:34 UTC, Andrei Alexandrescu wrote: We're happy to report that the D Language Foundation is now a public charity operating under US Internal Revenue Code Section 501(c)(3). The decision is retroactive to September 23, 2015. This has wide-ranging implications, the most important being that individuals and organizations may make tax deductible bequests, devises, transfers, or gifts to the Foundation. We will mull over defining donation and sponsorship packages in the near future. If interested in donating spontaneously, feel free to reach out to us via email at foundat...@dlang.org. Many thanks are due to the folks in this community who asked for and supported this initiative. Fantastic! Congrats, nice work!