Re: Formatted read consumes input

2012-09-08 Thread kenji hara
I have commented on the pull request.
I don't like adding convenience interfaces to the std.format module.

https://github.com/D-Programming-Language/phobos/pull/777#issuecomment-8385551

Kenji Hara

2012/9/8 monarch_dodra monarchdo...@gmail.com:
 On Friday, 7 September 2012 at 15:34:12 UTC, monarch_dodra wrote:

 I think this is a good solution. Do you see anything I may have failed to
 see?


 I've made a pull request out of it.

 https://github.com/D-Programming-Language/phobos/pull/777


Re: Formatted read consumes input

2012-09-08 Thread kenji hara
2012/9/8 monarch_dodra monarchdo...@gmail.com:
[snip]

 Still, I find it horrible to have to create a named dummy variable just
 when I simply want to pass a copy of my range.

Why are you afraid of declaring a dummy variable?
formattedRead is a parser, not an algorithm (as I said in the pull
request comment). After calling it, zero or more elements will remain.
And in almost all cases, the remainder will be used for another purpose,
or just checked to be empty.

int n = formattedRead(input_range, fmt, args...);
next_parsing(input_range);   // reusing input_range
assert(input_range.empty);  // or just checked that is empty

If formattedRead could receive an rvalue, calling it would discard the
remainder, and that would cause hidden bugs.

int n = formattedRead(r.save, fmt, args...);
// If the remainder is not empty, it is ignored. Is this expected, or
// is it a logic bug?

auto dummy = r.save;
int n = formattedRead(dummy, fmt, args...);
assert(dummy.empty);   // You can assert that remains should be empty.

formattedRead returns multiple pieces of state (the values that were read,
how many values were read, and the remaining input), so allowing callers to
ignore them would invite misuse and bugs.
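
A minimal sketch of the pattern argued for above, assuming std.conv.parse (which also takes its input by ref) as a stand-in for formattedRead. The helper name parseWholeInt is hypothetical and not part of Phobos:

```d
import std.conv : parse;
import std.range.primitives : save;

// Hypothetical helper: parse an int and report whether any input was left over.
// The named copy is what makes the leftover input observable.
bool parseWholeInt(string input, out int value)
{
    auto dummy = input.save;     // named copy; `input` itself stays untouched
    value = parse!int(dummy);    // parse takes `dummy` by ref and advances it
    return dummy.length == 0;    // true only if nothing remained
}

void main()
{
    int v;
    assert(parseWholeInt("42", v) && v == 42);
    assert(!parseWholeInt("42abc", v) && v == 42); // "abc" was left unread
}
```

With an unnamed temporary there would be no `dummy` to inspect after the call, which is the hidden-bug scenario described above.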

Kenji Hara


Re: Formatted read consumes input

2012-09-08 Thread monarch_dodra

On Saturday, 8 September 2012 at 12:10:26 UTC, kenji hara wrote:

2012/9/8 monarch_dodra monarchdo...@gmail.com:
[snip]


Still, I find it horrible to have to create a named dummy variable just
when I simply want to pass a copy of my range.


Why are you afraid of declaring a dummy variable?
formattedRead is a parser, not an algorithm (as I said in the pull
request comment). After calling it, zero or more elements will remain.
And in almost all cases, the remainder will be used for another purpose,
or just checked to be empty.

int n = formattedRead(input_range, fmt, args...);
next_parsing(input_range);   // reusing input_range
assert(input_range.empty);  // or just checked that is empty

If formattedRead could receive an rvalue, calling it would discard the
remainder, and that would cause hidden bugs.

int n = formattedRead(r.save, fmt, args...);
// If the remainder is not empty, it is ignored. Is this expected, or
// is it a logic bug?

auto dummy = r.save;
int n = formattedRead(dummy, fmt, args...);
assert(dummy.empty);   // You can assert that remains should be empty.


formattedRead returns multiple pieces of state (the values that were read,
how many values were read, and the remaining input), so allowing callers to
ignore them would invite misuse and bugs.

Kenji Hara


Hum, I think I see your point, although in my opinion, checking 
the return value is all that is required for generic error 
checking.


Checking the state of the range afterwards is being extra 
careful for a specific use case, and should not necessarily be 
forced onto the programmer.


I'll close the pull in the morning.


Re: Formatted read consumes input

2012-09-07 Thread Steven Schveighoffer
On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra monarchdo...@gmail.com  
wrote:



As title implies:


import std.stdio;
import std.format;

void main()
{
   string s = "42";
   int v;
   formattedRead(s, "%d", &v);
   writefln("[%s] [%s]", s, v);
}

[] [42]


Is this the expected behavior?

Furthermore, it is not possible to try to save s:

import std.stdio;
import std.format;
import std.range;

void main()
{
   string s = "42";
   int v;
   formattedRead(s.save, "%d", &v);
   writefln("[%s] [%s]", s, v);
}

main.d(9): Error: template std.format.formattedRead does not match any  
function template declaration
C:\D\dmd.2.060\dmd2\windows\bin\..\..\src\phobos\std\format.d(526):  
Error: template std.format.formattedRead(R,Char,S...) cannot deduce  
template function from argument types !()(string,string,int*)



The workaround is to have a named backup:
   auto ss = s.save;
   formattedRead(ss, "%d", &v);


I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);

Is there a particular reason for this pass by ref? It is inconsistent  
with the rest of phobos, or even C's scanf?


Is this a file-able bug_report/enhancement_request?


I believe it behaves as designed, but could be designed in such a way that  
does not need ref input range.  In fact, I think actually R needing to be  
ref is a bad thing.  Consider that if D didn't consider string literals to  
be lvalues (an invalid assumption IMO), then passing a string literal as  
the input would not work!


The only issue is, what if you *do* want ref behavior for strings?  You  
would need to wrap the string into a ref'd range.  That is not a good  
proposition.  Unfortunately, the way IFTI works, there isn't an  
opportunity to affect the parameter type IFTI decides to use.


I think a reasonable enhancement would be to add a formattedReadNoref (or  
better named alternative) that does not take a ref argument.


-Steve


Re: Formatted read consumes input

2012-09-07 Thread monarch_dodra
On Friday, 7 September 2012 at 13:58:43 UTC, Steven Schveighoffer 
wrote:

On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra

The only issue is, what if you *do* want ref behavior for 
strings?  You would need to wrap the string into a ref'd range.
 That is not a good proposition.  Unfortunately, the way IFTI 
works, there isn't an opportunity to affect the parameter type 
IFTI decides to use.


[SNIP]

-Steve


If you *do* want ref behavior, I still don't see why we don't 
just do it the algorithm way and return by value:



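import std.format;
import std.range;
import std.stdio;
import std.typecons;

// Takes the range by value and returns (number of values read, remaining range).
Tuple!(uint, R)
formattedRead2(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    auto ret = formattedRead(r, fmt, args);
    return Tuple!(uint, R)(ret, r);
}

void main()
{
  string s = "42 worlds";
  int v;
  s = formattedRead2(s.save, "%d", &v)[1]; // keep only the remaining range
  writefln("[%s][%s]", v, s);
}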




Re: Formatted read consumes input

2012-09-07 Thread Steven Schveighoffer
On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra monarchdo...@gmail.com  
wrote:



On Friday, 7 September 2012 at 13:58:43 UTC, Steven Schveighoffer wrote:

On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra

The only issue is, what if you *do* want ref behavior for strings?  You  
would need to wrap the string into a ref'd range.
 That is not a good proposition.  Unfortunately, the way IFTI works,  
there isn't an opportunity to affect the parameter type IFTI decides to  
use.


[SNIP]

-Steve


If you *do* want ref behavior, I still don't see why we don't just  
do it the algorithm way and return by value:



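import std.format;
import std.range;
import std.stdio;
import std.typecons;

// Takes the range by value and returns (number of values read, remaining range).
Tuple!(uint, R)
formattedRead2(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    auto ret = formattedRead(r, fmt, args);
    return Tuple!(uint, R)(ret, r);
}

void main()
{
  string s = "42 worlds";
  int v;
  s = formattedRead2(s.save, "%d", &v)[1]; // keep only the remaining range
  writefln("[%s][%s]", v, s);
}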




This looks ugly.  Returning a tuple and having to split the result is  
horrible, I hated dealing with that in C++ (and I even wrote stuff that  
returned pairs!)


Not only that, but there are possible ranges which may not be reassignable.

I'd rather have a way to wrap a string into a ref-based input range.

We have three situations:

1. input range is a ref type already (i.e. a class or a pImpl struct), no  
need to pass this by ref, just wastes cycles doing double dereference.

2. input range is a value type, and you want to preserve the original.
3. input range is a value type, and you want to update the original.

I'd like to see the library automatically make the right decision for 1,  
and give you some mechanism to choose between 2 and 3.  To preserve  
existing code, 3 should be the default.
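
One way to picture the choice between situations 2 and 3 without touching the function's signature: a by-value function leaves a plain string alone, while wrapping the string with std.range.refRange lets the same function update the original. This is only a sketch. readDigits is a hypothetical stand-in for formattedRead, and it is an assumption that a wrapper along these lines is what a "ref-based input range" would look like:

```d
import std.ascii : isDigit;
import std.range : refRange;
import std.range.primitives : empty, front, popFront;

// Hypothetical stand-in for a parser that takes its range by value.
uint readDigits(R)(R r)
{
    uint n = 0;
    while (!r.empty && isDigit(r.front))
    {
        n = n * 10 + cast(uint)(r.front - '0');
        r.popFront();
    }
    return n;
}

void main()
{
    string s = "42 worlds";

    // Situation 2: pass by value, the original is preserved.
    assert(readDigits(s) == 42);
    assert(s == "42 worlds");

    // Situation 3: wrap in a ref-based range, the original is updated.
    assert(readDigits(refRange(&s)) == 42);
    assert(s == " worlds");
}
```

The call site then states which behavior is wanted, at the cost of making 2 (not 3) the default, so this would not preserve existing code as requested above.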


-Steve


Re: Formatted read consumes input

2012-09-07 Thread Jonathan M Davis
On Friday, September 07, 2012 10:52:07 Steven Schveighoffer wrote:
 We have three situations:
 
 1. input range is a ref type already (i.e. a class or a pImpl struct), no
 need to pass this by ref, just wastes cycles doing double dereference.
 2. input range is a value type, and you want to preserve the original.
 3. input range is a value type, and you want to update the original.
 
 I'd like to see the library automatically make the right decision for 1,
 and give you some mechanism to choose between 2 and 3.  To preserve
 existing code, 3 should be the default.

Does it _ever_ make sense for a range to be an input range and not a forward 
range and _not_ have it be a reference type? Since it would be implicitly 
saving it if it were a value type, it would then make sense that it should 
have save on it. So, I don't think that input ranges which aren't forward 
ranges make any sense unless they're reference types, in which case, there's 
no point in taking them by ref, and you _can't_ preserve the original.

- Jonathan M Davis


Re: Formatted read consumes input

2012-09-07 Thread monarch_dodra
On Friday, 7 September 2012 at 14:51:45 UTC, Steven Schveighoffer 
wrote:

On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra

This looks ugly.  Returning a tuple and having to split the 
result is horrible, I hated dealing with that in C++ (and I 
even wrote stuff that returned pairs!)


Not only that, but there are possible ranges which may not be 
reassignable.


I'd rather have a way to wrap a string into a ref-based input 
range.


We have three situations:

1. input range is a ref type already (i.e. a class or a pImpl 
struct), no need to pass this by ref, just wastes cycles doing 
double dereference.
2. input range is a value type, and you want to preserve the 
original.
3. input range is a value type, and you want to update the 
original.


I'd like to see the library automatically make the right 
decision for 1, and give you some mechanism to choose between 2 
and 3.  To preserve existing code, 3 should be the default.


-Steve


True...

Still, I find it horrible to have to create a named dummy 
variable just when I simply want to pass a copy of my range.


I think I found 2 other solutions:
1: auto ref.
2: Kind of like auto ref: Just provide a non-ref overload. This 
creates less executable bloat.


Like this:

//Formatted read for R-Value input range.
uint formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    // r is an lvalue here, so this call is meant to resolve to the ref overload below.
    return formattedRead(r, fmt, args);
}
//Standard formatted read
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args)


This allows me to write, as I would expect:


void main()
{
  string s = "x42xT";
  int v;
  formattedRead(s.save, "x%dx", &v); //Passing a copy
  writefln("[%s][%s]", v, s);
  formattedRead(s, "x%dx", &v); //Please consume me
  writefln("[%s][%s]", v, s);
}

[42][x42xT] //My range is unchanged
[42][T] //My range was consumed


I think this is a good solution. Do you see anything I may have 
failed to see?
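
For comparison, option 1 (auto ref) could be sketched as follows. This is an illustration only: readDigitsAutoRef is a hypothetical stand-in for formattedRead, not the code from the pull request.

```d
import std.ascii : isDigit;
import std.range.primitives : empty, front, popFront, save;

// Hypothetical stand-in for formattedRead. With auto ref, lvalue arguments
// are bound by reference and rvalue arguments are taken by value.
uint readDigitsAutoRef(R)(auto ref R r)
{
    uint n = 0;
    while (!r.empty && isDigit(r.front))
    {
        n = n * 10 + cast(uint)(r.front - '0');
        r.popFront();
    }
    return n;
}

void main()
{
    string s = "42 worlds";

    assert(readDigitsAutoRef(s.save) == 42); // rvalue: only the copy is consumed
    assert(s == "42 worlds");

    assert(readDigitsAutoRef(s) == 42);      // lvalue: s itself is consumed
    assert(s == " worlds");
}
```

The observable behavior matches the two-overload version; the difference is that auto ref instantiates the template separately for the lvalue and rvalue cases.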


Re: Formatted read consumes input

2012-09-07 Thread monarch_dodra

On Friday, 7 September 2012 at 15:34:12 UTC, monarch_dodra wrote:
I think this is a good solution. Do you see anything I may have 
failed to see?


I've made a pull request out of it.

https://github.com/D-Programming-Language/phobos/pull/777


Re: Formatted read consumes input

2012-09-07 Thread Steven Schveighoffer
On Fri, 07 Sep 2012 11:04:36 -0400, Jonathan M Davis jmdavisp...@gmx.com  
wrote:



On Friday, September 07, 2012 10:52:07 Steven Schveighoffer wrote:

We have three situations:

1. input range is a ref type already (i.e. a class or a pImpl struct), no
need to pass this by ref, just wastes cycles doing double dereference.
2. input range is a value type, and you want to preserve the original.
3. input range is a value type, and you want to update the original.

I'd like to see the library automatically make the right decision for 1,
and give you some mechanism to choose between 2 and 3.  To preserve
existing code, 3 should be the default.


Does it _ever_ make sense for a range to be an input range and not a
forward range and _not_ have it be a reference type?


No it doesn't.  That is case 1.

However, it's quite easy to forget to define save when your range really  
is a forward range.  I don't really know a good way to fix this.  To  
assume that an input-and-not-forward range has reference semantics is  
prone to inappropriate code compiling just fine.


Clearly we can say classes are easily defined as not needing ref.

-Steve


Re: Formatted read consumes input

2012-09-07 Thread Steven Schveighoffer
On Fri, 07 Sep 2012 11:34:28 -0400, monarch_dodra monarchdo...@gmail.com  
wrote:



On Friday, 7 September 2012 at 14:51:45 UTC, Steven Schveighoffer wrote:

On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra

This looks ugly.  Returning a tuple and having to split the result is  
horrible, I hated dealing with that in C++ (and I even wrote stuff that  
returned pairs!)


Not only that, but there are possible ranges which may not be  
reassignable.


I'd rather have a way to wrap a string into a ref-based input range.

We have three situations:

1. input range is a ref type already (i.e. a class or a pImpl struct),  
no need to pass this by ref, just wastes cycles doing double  
dereference.

2. input range is a value type, and you want to preserve the original.
3. input range is a value type, and you want to update the original.

I'd like to see the library automatically make the right decision for  
1, and give you some mechanism to choose between 2 and 3.  To preserve  
existing code, 3 should be the default.


-Steve


True...

Still, I find it horrible to have to create a named dummy variable  
just when I simply want to pass a copy of my range.


I think I found 2 other solutions:
1: auto ref.
2: Kind of like auto ref: Just provide a non-ref overload. This creates  
less executable bloat.


Like this:

//Formatted read for R-Value input range.
uint formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    // r is an lvalue here, so this call is meant to resolve to the ref overload below.
    return formattedRead(r, fmt, args);
}
//Standard formatted read
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args)

This allows me to write, as I would expect:


void main()
{
   string s = "x42xT";
   int v;
   formattedRead(s.save, "x%dx", &v); //Passing a copy
   writefln("[%s][%s]", v, s);
   formattedRead(s, "x%dx", &v); //Please consume me
   writefln("[%s][%s]", v, s);
}

[42][x42xT] //My range is unchanged
[42][T] //My range was consumed


I think this is a good solution. Do you see anything I may have failed  
to see?


Well, this does work.  But I don't like that the semantics depend on  
whether the value is an rvalue or not.


Note that even ranges that are true input ranges (i.e. a file) still  
consume their data, even as rvalues, there is no way around it.


-Steve


Re: Formatted read consumes input

2012-09-07 Thread monarch_dodra
On Friday, 7 September 2012 at 18:15:00 UTC, Steven Schveighoffer 
wrote:


Well, this does work.  But I don't like that the semantics 
depend on whether the value is an rvalue or not.


Note that even ranges that are true input ranges (i.e. a file) 
still consume their data, even as rvalues, there is no way 
around it.


-Steve


Yes, but that is another issue, it is a copy vs save semantic 
issue. In theory, one should assume that *even* with pass by 
value, if you want your range to not be consumed, you have to 
call save. Most ranges are value types, so we tend to forget 
it. std.algorithm had a few save-related bugs like that as a 
matter of fact.


But, contrary to post 1, that is not the actual issue being fixed 
here. The fix merely makes the call compile with an unnamed temporary:

formattedRead(file.save, ...)
And now it compiles fine, AND the range is saved. That's it. 
Nothing more, nothing less.


...

That's *if* file provides save. I do not know much about 
file/stream handling in D, but you get my save point.


Re: Formatted read consumes input

2012-08-24 Thread Dmitry Olshansky

On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:

As title implies:


import std.stdio;
import std.format;

void main()
{
  string s = "42";
  int v;
  formattedRead(s, "%d", &v);
  writefln("[%s] [%s]", s, v);
}

[] [42]


Is this the expected behavior?


Yes, both the parse family and formattedRead operate on a ref 
argument, which means they modify it in place. Also consider 
that two consecutive reads should obviously read the first and 
second values in the string, not the same one twice.
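
A small sketch of the consecutive-read point, using std.conv.parse from the parse family mentioned above. The helper name readTwoInts is hypothetical:

```d
import std.conv : parse;
import std.string : stripLeft;

// Hypothetical helper: read two whitespace-separated ints from one string.
int[2] readTwoInts(string s)
{
    int[2] result;
    result[0] = parse!int(s);  // consumes the first number from s
    s = s.stripLeft();         // skip the separating whitespace
    result[1] = parse!int(s);  // continues where the first read stopped
    return result;
}

void main()
{
    assert(readTwoInts("1 2") == [1, 2]);
}
```

If parse took its argument by value, the second call would see the same input as the first and read the first number again.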




Furthermore, it is not possible to try to save s:

import std.stdio;
import std.format;
import std.range;

void main()
{
  string s = "42";
  int v;
  formattedRead(s.save, "%d", &v);
  writefln("[%s] [%s]", s, v);
}



Yes, because ref doesn't bind to an rvalue.


The workaround is to have a named backup:
  auto ss = s.save;
  formattedRead(ss, "%d", &v);


I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);




As I explained above the reason is because the only sane logic of 
multiple reads is to consume input and to do so it needs ref.


Is there a particular reason for this pass by ref? It is 
inconsistent with the rest of phobos, or even C's scanf?


C's scanf is a poor argument, as it uses pointers instead of ref 
(and it can't do ref, as there is no ref in C :) ). Yet it doesn't 
allow reading things over a couple of calls, AFAIK. In C, scanf 
returns the number of arguments successfully read, not bytes, so 
there is no way to continue from where it stopped.


BTW it's not documented what formattedRead returns ... just ouch.


Re: Formatted read consumes input

2012-08-24 Thread monarch_dodra

On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);




As I explained above the reason is because the only sane logic 
of multiple reads is to consume input and to do so it needs ref.


I had actually considered that argument. But a lot of algorithms 
have the same approach, yet they don't take refs, they *return* 
the consumed front:



R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)

auto s2 = formattedRead(s, "%d", &v);


Or arguably:


Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)



minCount, boyerMooreFinder and levenshteinDistanceAndPath 
all take this approach to return a consumed range plus an 
index/count.


Re: Formatted read consumes input

2012-08-24 Thread Denis Shelomovskij

24.08.2012 16:16, monarch_dodra wrote:

On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:

On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:

I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);



As I explained above the reason is because the only sane logic of
multiple reads is to consume input and to do so it needs ref.


I had actually considered that argument. But a lot of algorithms have
the same approach, yet they don't take refs, they *return* the consumed
front:


R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)

auto s2 = formattedRead(s, "%d", &v);


Or arguably:


Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)


minCount, boyerMooreFinder and levenshteinDistanceAndPath all take
this approach to return a consumed range plus an index/count.


It's because `formattedRead` is designed to work with an input range 
which isn't a forward range (not save-able).


--
Денис В. Шеломовский
Denis V. Shelomovskij


Re: Formatted read consumes input

2012-08-24 Thread Tove

On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
C's scanf is a poor argument as it uses pointers instead of ref 
(and it can't do ref as there is no ref in C :) ). Yet it 
doesn't allow to read things in a couple of calls AFAIK. In C 
scanf returns number of arguments successfully read not bytes 
so there is no way to continue from where it stopped.


BTW it's not documented what formattedRead returns ... just 
ouch.


Actually... look up %n in sscanf it's wonderful, I use it all 
the time.
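
For readers who have not met %n: it stores the number of characters consumed so far, which is what lets a caller resume after a partial read. A minimal sketch calling C's sscanf from D through core.stdc.stdio; the wrapper name consumedAfterInt is hypothetical:

```d
import core.stdc.stdio : sscanf;

// Hypothetical wrapper: returns how many characters sscanf consumed while
// reading one int, or -1 if no int could be read.
int consumedAfterInt(const(char)* input, out int value)
{
    int consumed = 0;
    // %n stores the count of characters read so far; it does not count
    // as a conversion in sscanf's return value.
    if (sscanf(input, "%d%n", &value, &consumed) != 1)
        return -1;
    return consumed;
}

void main()
{
    int v;
    assert(consumedAfterInt("42 worlds", v) == 2);
    assert(v == 42);
}
```

The caller can then continue parsing at `input + consumed`, which plays the role that the advanced ref range plays for formattedRead.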




Re: Formatted read consumes input

2012-08-24 Thread monarch_dodra
On Friday, 24 August 2012 at 13:08:43 UTC, Denis Shelomovskij wrote:

24.08.2012 16:16, monarch_dodra wrote:
On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);


As I explained above the reason is because the only sane logic of
multiple reads is to consume input and to do so it needs ref.


I had actually considered that argument. But a lot of algorithms have
the same approach, yet they don't take refs, they *return* the consumed
front:


R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)

auto s2 = formattedRead(s, "%d", &v);


Or arguably:


Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)


minCount, boyerMooreFinder and levenshteinDistanceAndPath all take
this approach to return a consumed range plus an index/count.


It's because `formattedRead` is designed to work with an input 
range which isn't a forward range (not save-able).


You had me ready to throw in the towel on that argument, but 
thinking harder about it, that doesn't really change anything 
actually:


At the end of formattedRead, the passed range has a certain 
state. Whether you give this range back to the caller via pass 
by ref or return by value has nothing to do with save-ability.


Re: Formatted read consumes input

2012-08-24 Thread Dmitry Olshansky

On 24-Aug-12 17:43, Tove wrote:

On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:

C's scanf is a poor argument as it uses pointers instead of ref (and
it can't do ref as there is no ref in C :) ). Yet it doesn't allow to
read things in a couple of calls AFAIK. In C scanf returns number of
arguments successfully read not bytes so there is no way to continue
from where it stopped.

BTW it's not documented what formattedRead returns ... just ouch.


Actually... look up %n in sscanf it's wonderful, I use it all the time.


God... what an awful kludge :)

--
Olshansky Dmitry


Formatted read consumes input

2012-08-23 Thread monarch_dodra

As title implies:


import std.stdio;
import std.format;

void main()
{
  string s = "42";
  int v;
  formattedRead(s, "%d", &v);
  writefln("[%s] [%s]", s, v);
}

[] [42]


Is this the expected behavior?

Furthermore, it is not possible to try to save s:

import std.stdio;
import std.format;
import std.range;

void main()
{
  string s = "42";
  int v;
  formattedRead(s.save, "%d", &v);
  writefln("[%s] [%s]", s, v);
}

main.d(9): Error: template std.format.formattedRead does not 
match any function template declaration
C:\D\dmd.2.060\dmd2\windows\bin\..\..\src\phobos\std\format.d(526): 
Error: template std.format.formattedRead(R,Char,S...) cannot 
deduce template function from argument types 
!()(string,string,int*)



The workaround is to have a named backup:
  auto ss = s.save;
  formattedRead(ss, "%d", &v);


I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);


Is there a particular reason for this pass by ref? It is 
inconsistent with the rest of phobos, or even C's scanf?


Is this a file-able bug_report/enhancement_request?