Re: Efficiently streaming data to associative array

2017-08-09 Thread Guillaume Chatelet via Digitalmars-d-learn

On Wednesday, 9 August 2017 at 10:00:14 UTC, kerdemdemir wrote:
As a total beginner I am not feeling very comfortable with 
basic operations in AA.


First, even though I am very happy we have pointers, using pointers 
in a common operation like this IMHO makes the language a bit 
unsafe.


Second, the "in" keyword has always seemed so specific to me.

I think I will use your solution "ref Value 
GetWithDefault(Value)" very often since it hides the two things 
above.


You don't need this most of the time, if you already have the 
correct type it's easy:


size_t[string][string] indexed_map;

string a, b; // a and b are strings not char[]
indexed_map[a][b] = value; // this will create the AA slots if needed
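
A minimal sketch of that auto-creation, in case it helps. The helper name `store` and the sample keys are made up for illustration; the only thing it relies on is that assigning through both index levels creates the inner associative array on demand:

```d
// Hypothetical helper: stores value under [a][b], creating slots as needed.
void store(ref size_t[string][string] map, string a, string b, size_t value)
{
    map[a][b] = value; // inner AA is created on first assignment
}

void main()
{
    size_t[string][string] indexed_map;
    store(indexed_map, "a", "b", 15);
    store(indexed_map, "a", "c", 7);

    assert(indexed_map["a"]["b"] == 15);
    assert(indexed_map["a"].length == 2);
    assert(("z" in indexed_map) is null); // a lookup with `in` creates nothing
}
```

Note that this only works when the keys are already `string`; it does not address the `char[]` case discussed next.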


In my specific case the data is streamed from stdin and is not 
kept in memory.
byLine returns a view of the stdin buffer, which may be overwritten 
at the next for-loop iteration. So I can't use the index operator 
directly: I need a string that does not change over time.
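
To illustrate the problem without stdin, here is a small sketch where a plain `char[]` stands in for byLine's internal buffer (that stand-in, and the helper name `stableCopy`, are assumptions for the example). A slice follows whatever is written into the buffer, while an `idup`'ed copy stays put:

```d
// Hypothetical helper: makes an immutable copy that is safe to use as an AA key.
string stableCopy(const(char)[] view)
{
    return view.idup;
}

void main()
{
    char[] buffer = "abc".dup;            // stands in for the reused line buffer
    const(char)[] view = buffer[0 .. 3];  // what a byLine-style slice looks like
    string key = stableCopy(view);

    buffer[] = "xyz";                     // the "next line" overwrites the buffer

    assert(view == "xyz");                // the view changed under our feet
    assert(key == "abc");                 // the copy did not
}
```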


I could have used this code:

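import std.format : formattedRead;
import std.stdio;

void main() {
  size_t[string][string] indexed_map;
  foreach(char[] line ; stdin.byLine) {
    char[] a;
    char[] b;
    size_t value;
    line.formattedRead!"%s,%s,%d"(a,b,value);
    // idup copies the key at every iteration, even if it already exists
    indexed_map[a.idup][b.idup] = value;
  }
  indexed_map.writeln;
}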

It's perfectly ok if data is small. In my case data is huge and 
creating a copy of the strings at each iteration is costly.


Re: Efficiently streaming data to associative array

2017-08-08 Thread Guillaume Chatelet via Digitalmars-d-learn
On Tuesday, 8 August 2017 at 16:00:17 UTC, Steven Schveighoffer 
wrote:

On 8/8/17 11:28 AM, Guillaume Chatelet wrote:
Let's say I'm processing MB of data, I'm lazily iterating over 
the incoming lines storing data in an associative array. I 
don't want to copy unless I have to.


Contrived example follows:

input file
--
a,b,15
c,d,12


Efficient ingestion
---
void main() {

   size_t[string][string] indexed_map;

   foreach(char[] line ; stdin.byLine) {
 char[] a;
 char[] b;
 size_t value;
 line.formattedRead!"%s,%s,%d"(a,b,value);

 auto pA = a in indexed_map;
 if(pA is null) {
   pA = &(indexed_map[a.idup] = (size_t[string]).init);
 }

 auto pB = b in (*pA);
 if(pB is null) {
   pB = &((*pA)[b.idup] = size_t.init);
 }

 // Technically unneeded but let's say we have more than 2 dimensions.

 (*pB) = value;
   }

   indexed_map.writeln;
}


I qualify this code as ugly but fast. Any idea on how to make 
this less ugly? Is there something in Phobos to help?


I wouldn't use formattedRead, as I think this is going to 
allocate temporaries for a and b.


Note, this is very close to Jon Degenhardt's blog post in May: 
https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/


-Steve
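
For what it's worth, a rough sketch of parsing a line without formattedRead could look like the following. It uses `std.algorithm.iteration.splitter` and `std.conv.to`; the `Record` struct and `parseLine` name are invented for the example, and whether formattedRead really allocates temporaries is not verified here:

```d
import std.algorithm.iteration : splitter;
import std.conv : to;

// Hypothetical container for one parsed line; a and b are slices of the input.
struct Record
{
    char[] a;
    char[] b;
    size_t value;
}

// Splits "a,b,15" on commas; assumes exactly three well-formed fields.
Record parseLine(char[] line)
{
    auto fields = line.splitter(',');
    Record r;
    r.a = fields.front; fields.popFront();
    r.b = fields.front; fields.popFront();
    r.value = fields.front.to!size_t;
    return r;
}

void main()
{
    char[] line = "a,b,15".dup;
    auto r = parseLine(line);
    assert(r.a == "a" && r.b == "b" && r.value == 15);
    assert(r.a.ptr is line.ptr); // first field is a slice of line, not a copy
}
```

The fields are still views into the line buffer, so they need the same `idup`-on-insert treatment as before when used as AA keys.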


I haven't yet dug into formattedRead but thx for letting me know 
: )
I was mostly speaking about the pattern with the AA. I guess the 
best I can do is a templated function to hide the ugliness.



ref Value GetWithDefault(Value)(ref Value[string] map, const 
(char[]) key) {

  auto pValue = key in map;
  if(pValue) return *pValue;
  return map[key.idup] = Value.init;
}

void main() {

  size_t[string][string] indexed_map;

  foreach(char[] line ; stdin.byLine) {
    char[] a;
    char[] b;
    size_t value;
    line.formattedRead!"%s,%s,%d"(a,b,value);

    indexed_map.GetWithDefault(a).GetWithDefault(b) = value;
  }

  indexed_map.writeln;
}


Not too bad actually !


Efficiently streaming data to associative array

2017-08-08 Thread Guillaume Chatelet via Digitalmars-d-learn
Let's say I'm processing MB of data, I'm lazily iterating over 
the incoming lines storing data in an associative array. I don't 
want to copy unless I have to.


Contrived example follows:

input file
--
a,b,15
c,d,12
...

Efficient ingestion
---
void main() {

  size_t[string][string] indexed_map;

  foreach(char[] line ; stdin.byLine) {
    char[] a;
    char[] b;
    size_t value;
    line.formattedRead!"%s,%s,%d"(a,b,value);

    auto pA = a in indexed_map;
    if(pA is null) {
      pA = &(indexed_map[a.idup] = (size_t[string]).init);
    }

    auto pB = b in (*pA);
    if(pB is null) {
      pB = &((*pA)[b.idup] = size_t.init);
    }

    // Technically unneeded but let's say we have more than 2 dimensions.

    (*pB) = value;
  }

  indexed_map.writeln;
}


I qualify this code as ugly but fast. Any idea on how to make 
this less ugly? Is there something in Phobos to help?


Re: Floating point rounding

2017-03-02 Thread Guillaume Chatelet via Digitalmars-d-learn

On Thursday, 2 March 2017 at 21:34:56 UTC, ag0aep6g wrote:

On 03/02/2017 10:10 PM, Guillaume Chatelet wrote:
On Thursday, 2 March 2017 at 20:30:47 UTC, Guillaume Chatelet 
wrote:

Here is the same code in D:
void main(string[] args)
{
    import std.math;
    import std.stdio;
    FloatingPointControl fpctrl;
    fpctrl.rounding = FloatingPointControl.roundUp;
    writefln("%.32g", float.min_normal + 1.0f);
}

Execution on my machine yields:
dmd -run test_denormal.d
1

Did I miss something?


This example is closer to the C++ one:

void main(string[] args)
{
    import std.stdio;
    import core.stdc.fenv;
    fesetround(FE_UPWARD);
    writefln("%.32g", float.min_normal + 1.0f);
}

It still yields "1"


This prints the same as the C++ version:


void main(string[] args)
{
    import std.stdio;
    import core.stdc.fenv;
    fesetround(FE_UPWARD);
    float x = 1.0f;
    x += float.min_normal;
    writefln("%.32g", x);
}


Soo, a bug/limitation of constant folding?

With FloatingPointControl it still prints "1". Does 
FloatingPointControl.rounding do something different than 
fesetround? The example in the docs [1] only shows how it 
changes rint's behavior.



[1] http://dlang.org/phobos/std_math.html#.FloatingPointControl


Thx for the investigation!
Here is the code for FloatingPointControl
https://github.com/dlang/phobos/blob/master/std/math.d#L4809

Other code (enableExceptions / disableExceptions) seems to have 
two code path depending on "version(X86_Any)", rounding doesn't.


Maybe that's the bug?



Re: Floating point rounding

2017-03-02 Thread Guillaume Chatelet via Digitalmars-d-learn
On Thursday, 2 March 2017 at 20:30:47 UTC, Guillaume Chatelet 
wrote:

Here is the same code in D:
void main(string[] args)
{
    import std.math;
    import std.stdio;
    FloatingPointControl fpctrl;
    fpctrl.rounding = FloatingPointControl.roundUp;
    writefln("%.32g", float.min_normal + 1.0f);
}

Execution on my machine yields:
dmd -run test_denormal.d
1

Did I miss something?


This example is closer to the C++ one:

void main(string[] args)
{
    import std.stdio;
    import core.stdc.fenv;
    fesetround(FE_UPWARD);
    writefln("%.32g", float.min_normal + 1.0f);
}

It still yields "1"


Floating point rounding

2017-03-02 Thread Guillaume Chatelet via Digitalmars-d-learn
I would expect that (1.0f + smallest float subnormal) > 1.0f when 
the Floating Point unit is set to Round Up.
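
To make the expectation concrete: with round-up, the sum should land on the next representable float above 1.0f. A small sketch of what that value is, using `std.math.nextUp` (this only computes the expected result, it does not touch the rounding mode; the function name `expectedRoundUpResult` is made up):

```d
import std.math : nextUp;

// Hypothetical helper: the smallest float strictly greater than 1.0f.
float expectedRoundUpResult()
{
    return nextUp(1.0f);
}

void main()
{
    float expected = expectedRoundUpResult();
    assert(expected > 1.0f);
    assert(expected == 1.0f + float.epsilon); // i.e. 1 + 2^-23
}
```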


Here is some C++ code:
#include <cfenv>
#include <cstdio>
#include <limits>

int main(int, char**) {
    std::fesetround(FE_UPWARD);
    printf("%.32g\n", std::numeric_limits<float>::denorm_min() + 1.0f);

    return 0;
}

Execution on my machine yields:
clang++ --std=c++11 test_denormal.cc && ./a.out
1.00000011920928955078125

Here is the same code in D:
void main(string[] args)
{
    import std.math;
    import std.stdio;
    FloatingPointControl fpctrl;
    fpctrl.rounding = FloatingPointControl.roundUp;
    writefln("%.32g", float.min_normal + 1.0f);
}

Execution on my machine yields:
dmd -run test_denormal.d
1

Did I miss something?



Re: Bug in csv or byLine ?

2016-01-11 Thread Guillaume Chatelet via Digitalmars-d-learn

On Sunday, 10 January 2016 at 19:50:15 UTC, Tobi G. wrote:
On Sunday, 10 January 2016 at 19:07:52 UTC, Jesse Phillips 
wrote:

On Sunday, 10 January 2016 at 18:09:23 UTC, Tobi G. wrote:

The bug has been fixed...


Do you have a link for the fix? Is there a BugZilla entry?


Yes sure..

https://issues.dlang.org/show_bug.cgi?id=15545

and the fix at github

https://github.com/D-Programming-Language/phobos/pull/3917


togrue


Thx for the fix !


Bug in csv or byLine ?

2016-01-08 Thread Guillaume Chatelet via Digitalmars-d-learn

$ cat debug.csv
timestamp,curr_property
2015-12-01 06:07:55,7035

$ cat process.d
import std.stdio;
import std.csv;
import std.algorithm;
import std.file;

void main(string[] args) {
  version (Fail) {
    File(args[1], "r").byLine.joiner("\n").csvReader.each!writeln;
  } else {
    readText(args[1]).csvReader.each!writeln;
  }
}

$ dmd -run ./process.d debug.csv
["timestamp", "curr_property"]
["2015-12-01 06:07:55", "7035"]

$ dmd -version=Fail -run ./process.d debug.csv
["timestamp", "curr_property"]
["2015-12-01 06:07:55", "7035"]
core.exception.AssertError@std/algorithm/iteration.d(2027): 
Assertion failure


??:? _d_assert [0x4633d3]
??:? void std.algorithm.iteration.__assert(int) [0x46d770]
??:? pure @property @safe dchar 
std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, 
char).ByLine, 
immutable(char)[]).joiner(std.stdio.File.ByLine!(char, 
char).ByLine, immutable(char)[]).Result.front() [0x44eaf0]
??:? void std.csv.CsvReader!(immutable(char)[], 1, 
std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, 
char).ByLine, 
immutable(char)[]).joiner(std.stdio.File.ByLine!(char, 
char).ByLine, immutable(char)[]).Result, dchar, 
immutable(char)[][]).CsvReader.popFront() [0x44f7fc]
??:? void 
std.algorithm.iteration.__T4eachS183std5stdio7writelnZ.each!(std.csv.CsvReader!(immutable(char)[], 1, std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).joiner(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).Result, dchar, immutable(char)[][]).CsvReader).each(std.csv.CsvReader!(immutable(char)[], 1, std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).joiner(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).Result, dchar, immutable(char)[][]).CsvReader) [0x4608f7]

??:? _Dmain [0x44bc93]


Any idea ?


Re: Bug in csv or byLine ?

2016-01-08 Thread Guillaume Chatelet via Digitalmars-d-learn

On Friday, 8 January 2016 at 13:22:40 UTC, Tobi G. wrote:
On Friday, 8 January 2016 at 12:13:59 UTC, Guillaume Chatelet 
wrote:

On Friday, 8 January 2016 at 12:07:05 UTC, Tobi G. wrote:
No, sorry. Under Windows DMD v2.069.2 it works perfectly in 
both cases.


Which compiler do you use?


- DMD64 D Compiler v2.069.2 on Linux.
- LDC 0.16.1 (DMD v2.067.1, LLVM 3.7.0)


I ran it now under Linux/Ubuntu DMD64 D Compiler v2.069.2

But both still worked..

Are there some characters in your input data which are invalid 
and not displayed in the forum?

(multiple empty lines after the actual csv data for example)

togrue


Indeed there's an empty line at the end of the csv.

Interestingly enough if I try with DMD64 D Compiler v2.069, the 
Fail version runs fine but the normal version returns:
std.csv.CSVException@/usr/include/dlang/dmd/std/csv.d(1246): Row 
3's length 1 does not match previous length of 2.


Re: Bug in csv or byLine ?

2016-01-08 Thread Guillaume Chatelet via Digitalmars-d-learn

On Friday, 8 January 2016 at 12:07:05 UTC, Tobi G. wrote:
No, sorry. Under Windows DMD v2.069.2 it works perfectly in 
both cases.


Which compiler do you use?


- DMD64 D Compiler v2.069.2 on Linux.
- LDC 0.16.1 (DMD v2.067.1, LLVM 3.7.0)

So if it works on windows I guess it's a problem with the File 
implementation.


You could run DMD with the -g option. This will print often 
more useful output, if it fails.


-g didn't bring much.

core.exception.AssertError@std/algorithm/iteration.d(2027): 
Assertion failure


??:? _d_assert [0x4a9c33]
??:? void std.algorithm.iteration.__assert(int) [0x4b8048]
/usr/include/dmd/phobos/std/algorithm/iteration.d:2027 pure 
@property @safe dchar 
std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, 
char).ByLine, 
immutable(char)[]).joiner(std.stdio.File.ByLine!(char, 
char).ByLine, immutable(char)[]).Result.front() [0x495330]
/usr/include/dmd/phobos/std/csv.d:1018 void 
std.csv.CsvReader!(immutable(char)[], 1, 
std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, 
char).ByLine, 
immutable(char)[]).joiner(std.stdio.File.ByLine!(char, 
char).ByLine, immutable(char)[]).Result, dchar, 
immutable(char)[][]).CsvReader.popFront() [0x49608c]
/usr/include/dmd/phobos/std/algorithm/iteration.d:881 void 
std.algorithm.iteration.__T4eachS183std5stdio7writelnZ.each!(std.csv.CsvReader!(immutable(char)[], 1, std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).joiner(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).Result, dchar, immutable(char)[][]).CsvReader).each(std.csv.CsvReader!(immutable(char)[], 1, std.algorithm.iteration.joiner!(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).joiner(std.stdio.File.ByLine!(char, char).ByLine, immutable(char)[]).Result, dchar, immutable(char)[][]).CsvReader) [0x4a5063]

./process.d:8 _Dmain [0x49226c]



Re: Idiomatic adjacent_difference

2015-10-16 Thread Guillaume Chatelet via Digitalmars-d-learn

On Friday, 16 October 2015 at 11:38:35 UTC, John Colvin wrote:

import std.range, std.algorithm;

auto slidingWindow(R)(R r, size_t n)
if(isForwardRange!R)
{
    //you could definitely do this with less overhead
    return roundRobin(r.chunks(n), r.save.drop(1).chunks(n))
        .filter!(p => p.length == n);
}

auto adjacentDiff(R)(R r)
{
    return r.slidingWindow(2).map!"a[1] - a[0]";
}


Nice !
I wanted to use lockstep(r, r.dropOne) but it doesn't return a 
Range :-/

It has to be used in a foreach.
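
For reference, a sketch of what the foreach-only lockstep version might look like. The function name `adjacentDiffLockstep` is made up, it assumes a non-empty `int[]` input (dropOne on an empty array is not valid), and it builds the result eagerly rather than returning a lazy range:

```d
import std.range : dropOne, lockstep;

// Hypothetical eager variant: collects differences of neighbouring elements.
int[] adjacentDiffLockstep(int[] r)
{
    int[] diffs;
    // lockstep stops at the shorter range by default
    foreach (a, b; lockstep(r, r.dropOne))
        diffs ~= b - a;
    return diffs;
}

void main()
{
    assert(adjacentDiffLockstep([0, 1, 2, 3]) == [1, 1, 1]);
}
```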


Idiomatic adjacent_difference

2015-10-16 Thread Guillaume Chatelet via Digitalmars-d-learn

Is there an idiomatic way to do:

int[] numbers = [0, 1, 2, 3];
assert(adjacent_diff(numbers) == [1, 1, 1]);

I can't find something useful in the std library.


Re: Idiomatic adjacent_difference

2015-10-16 Thread Guillaume Chatelet via Digitalmars-d-learn

On Friday, 16 October 2015 at 12:03:56 UTC, Per Nordlöw wrote:
On Friday, 16 October 2015 at 11:48:19 UTC, Edwin van Leeuwen 
wrote:

zip(r, r[1..$]).map!((t) => t[1]-t[0]);


And for InputRanges (not requiring random-access):

zip(r, r.dropOne).map!((t) => t[1]-t[0]);


That's neat. Thx guys :)
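
Putting the zip suggestion together with the original question, a self-contained sketch could be the following. The wrapper name `adjacentDiff` is just for the example, and it assumes the input is non-empty since dropOne pops unconditionally:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.range : dropOne, zip;

// Lazily yields r[i+1] - r[i]; zip stops when the shorter range runs out.
auto adjacentDiff(R)(R r)
{
    return zip(r, r.dropOne).map!(t => t[1] - t[0]);
}

void main()
{
    int[] numbers = [0, 1, 2, 3];
    assert(adjacentDiff(numbers).equal([1, 1, 1]));
}
```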


Re: Nested C++ namespace library linking

2015-01-21 Thread Guillaume Chatelet via Digitalmars-d-learn

On Wednesday, 21 January 2015 at 14:59:15 UTC, John Colvin wrote:

Looks like a bug to me, for sure.

In the mean-time you may be able to use some pragma(mangle, 
...) hacks to force the compiler to emit the right symbols.


Thx John,

extern(C++, A.B) {
  struct Type {}
  pragma(mangle, "_ZN1A1B3fooENS0_4TypeE") int foo(Type unused);
}

is indeed linking correctly :)


Nested C++ namespace library linking

2015-01-20 Thread Guillaume Chatelet via Digitalmars-d-learn

Consider the following foo.cpp

namespace A {
namespace B {
  struct Type {};
  int foo(Type unused){ return 42; }
}
}

Compile it : g++ foo.cpp -c -o foo.o

Then the following main.d

extern(C++, A.B) {
  struct Type {}
  int foo(Type unused);
}

void main() {
  foo(Type());
}

Compile it : dmd main.d foo.o
It fails with : undefined reference to « A::B::foo(A::Type) »

It looks like the Type is not resolved in the right namespace. 
A::Type instead of A::B::Type. Did I miss something or is this a 
bug ?


I also tried fully qualifying foo and Type but I end up with the 
exact same error :

  A.B.foo(A.B.Type());


Re: Nested C++ namespace library linking

2015-01-20 Thread Guillaume Chatelet via Digitalmars-d-learn

That's what I thought.
I reported this bug a while ago but it didn't get a lot of 
attention.

https://issues.dlang.org/show_bug.cgi?id=13337


Re: Should formattedWrite take the outputrange by ref?

2014-09-03 Thread Guillaume Chatelet via Digitalmars-d-learn

+1 I've been bitten by this also.