Re: std.string.indexOf with an optional start-at parameter?

2011-04-04 Thread Steven Schveighoffer
On Mon, 04 Apr 2011 10:10:00 -0400, Andrei Alexandrescu wrote:



> On 4/4/11 8:18 AM, Steven Schveighoffer wrote:
>> On Sun, 03 Apr 2011 14:24:33 -0400, spir wrote:

>>> Agreed this is a fairly standard param in other languages, but D
>>> easily (and rather cheaply) allows
>>> auto pos = indexOf(s[i..$], char);
>>
>> That doesn't work, because it gets you the position in relation to
>> s[i..$], whereas you want the position in relation to s.
>>
>> I think the requested feature is common enough to warrant inclusion,
>> especially since it could take care of out-of-bounds problems where
>> slicing would throw an error instead. To write the equivalent would be
>> very non-trivial.
>>
>> -Steve
>
> I'm worried that most people will want and mean n in indexOf(haystack,
> needle, n) as "start from the nth character from the front (or back)".
> Then we need the slower algorithm. Using a slice clarifies the intent on
> the caller's side.


I was about to write "what does that mean?", but I see what you mean now.

To me the most important part is that when you search on a slice,
the result is an index that has to be re-adjusted for the slice's offset.


That is, if I could get the right value by doing indexOf(s[5..$], "abc"),
then it would be great to just accept slices, but that call returns an
index relative to the slice, not to s.


So you have to re-adjust the return value: indexOf(s[5..$], "abc") + 5.

This is a little annoying, and dangerous, especially if the offset (5 in
this example) is not a simple literal.  It is very easy to write this
incorrectly.
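
One concrete way the adjustment goes wrong, as a minimal sketch: indexOf returns -1 when nothing is found, and adding the offset unconditionally turns that into a plausible-looking index. The helper names naiveFrom and checkedFrom are made up for this illustration and are not part of Phobos.

```d
import std.string : indexOf;

// Naive adjustment: adds the offset even when nothing was found,
// so a "not found" result of -1 becomes start - 1.
ptrdiff_t naiveFrom(string s, string needle, size_t start)
{
    return indexOf(s[start .. $], needle) + start;
}

// Checked adjustment: keeps -1 as the "not found" marker.
ptrdiff_t checkedFrom(string s, string needle, size_t start)
{
    auto i = indexOf(s[start .. $], needle);
    return i < 0 ? -1 : i + cast(ptrdiff_t) start;
}
```

With s = "hello world", searching for "abc" from offset 5 yields 4 from naiveFrom and -1 from checkedFrom. Both still throw a RangeError if start exceeds s.length.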


What if we had a different function name to re-assert intent?   
indexOfSlice(haystack, needle, slicestart, sliceend) where sliceend was  
optional?
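
A rough sketch of what such a function could look like, assuming the parameter order suggested above. indexOfSlice is hypothetical and not part of Phobos; clamping the bounds instead of throwing is one possible design choice, not a settled one.

```d
import std.string : indexOf;

// Hypothetical helper following the suggested signature.
// Searches haystack[sliceStart .. sliceEnd] and returns an index into haystack.
ptrdiff_t indexOfSlice(string haystack, string needle,
                       size_t sliceStart, size_t sliceEnd = size_t.max)
{
    // Clamp the end so an out-of-range request does not raise a RangeError.
    if (sliceEnd > haystack.length)
        sliceEnd = haystack.length;
    // An empty or inverted window cannot contain a match.
    if (sliceStart >= sliceEnd)
        return -1;

    auto i = indexOf(haystack[sliceStart .. sliceEnd], needle);
    return i < 0 ? -1 : i + cast(ptrdiff_t) sliceStart;
}
```

The distinct name keeps the "search inside a window, report a position in the whole string" intent visible at the call site, e.g. indexOfSlice(s, "abc", 5).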


-Steve


Re: std.string.indexOf with an optional start-at parameter?

2011-04-04 Thread Andrei Alexandrescu

On 4/4/11 8:18 AM, Steven Schveighoffer wrote:
> On Sun, 03 Apr 2011 14:24:33 -0400, spir wrote:

>> Agreed this is a fairly standard param in other languages, but D
>> easily (and rather cheaply) allows
>> auto pos = indexOf(s[i..$], char);
>
> That doesn't work, because it gets you the position in relation to
> s[i..$], whereas you want the position in relation to s.
>
> I think the requested feature is common enough to warrant inclusion,
> especially since it could take care of out-of-bounds problems where
> slicing would throw an error instead. To write the equivalent would be
> very non-trivial.
>
> -Steve


I'm worried that most people will want and mean n in indexOf(haystack, 
needle, n) as "start from the nth character from the front (or back)". 
Then we need the slower algorithm. Using a slice clarifies the intent on 
the caller's side.
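
To illustrate the distinction, here is a minimal sketch of the "nth character" reading, assuming UTF-8 input. The helper name indexOfFromChar is made up for this example; std.utf.toUTFindex does the character-to-code-unit translation, and that walk over the string is the slower part.

```d
import std.string : indexOf;
import std.utf : toUTFindex;

// Start the search at the n-th character (code point) rather than the
// n-th code unit. Assumes s holds at least n code points.
ptrdiff_t indexOfFromChar(string s, dchar c, size_t n)
{
    // toUTFindex has to decode from the front of the string to turn a
    // code point count into a code unit offset, which costs O(n).
    immutable start = toUTFindex(s, n);
    auto i = indexOf(s[start .. $], c);
    // The result is still a code unit index into s, as with indexOf.
    return i < 0 ? -1 : i + cast(ptrdiff_t) start;
}
```

For "žžž$a$" (each ž takes two code units) and n = 4, the search starts after the first '$' and reports index 8, whereas slicing at code unit 4 would start in the middle of the ž run and report the first '$' at index 6.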



Andrei


Re: std.string.indexOf with an optional start-at parameter?

2011-04-04 Thread Aleksandar Ružičić
On Mon, Apr 4, 2011 at 3:18 PM, Steven Schveighoffer wrote:
> On Sun, 03 Apr 2011 14:24:33 -0400, spir  wrote:
>> Agreed this is a fairly standard param in other languages, but D easily
>> (and rather cheaply) allows
>>     auto pos = indexOf(s[i..$], char);
>
> That doesn't work, because it gets you the position in relation to s[i..$],
> whereas you want the position in relation to s.
>

This works:
---
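import std.string;                // provides the two-argument indexOf
import std.traits : isSomeChar;

// Like std.string.indexOf, but starts at code unit sp; a negative sp
// counts back from the end of s. (ptrdiff_t replaces the old sizediff_t alias.)
ptrdiff_t indexOf(Char, T = ptrdiff_t)(in Char[] s, dchar c, T sp = 0)
    if (isSomeChar!Char)
{
    if (sp < 0)
        sp += s.length;

    if (sp >= 0 && sp < s.length)
    {
        // Qualify the call so it reaches Phobos instead of recursing here.
        auto i = std.string.indexOf(s[sp .. $], c);
        if (i > -1)
            return i + sp;        // translate back to an index into s
    }
    return -1;
}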
---

And it doesn't require changing the existing indexOf.


Re: std.string.indexOf with an optional start-at parameter?

2011-04-04 Thread Steven Schveighoffer

On Sun, 03 Apr 2011 14:24:33 -0400, spir  wrote:



> Agreed this is a fairly standard param in other languages, but D easily
> (and rather cheaply) allows
>
>  auto pos = indexOf(s[i..$], char);


That doesn't work, because it gets you the position in relation to  
s[i..$], whereas you want the position in relation to s.


I think the requested feature is common enough to warrant inclusion,  
especially since it could take care of out-of-bounds problems where  
slicing would throw an error instead.  To write the equivalent would be  
very non-trivial.


-Steve


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Aleksandar Ružičić
On Sun, Apr 3, 2011 at 10:56 PM, KennyTM~  wrote:
>> And javascript _does_ have true arrays, but it _doesn't_ have true
>> associative arrays (those are object literals).
>>
>
> I would not call it a true array if it is indexed by string internally.
> Anyway, this is not the main point.
>

You're right that JS arrays are not the point here, but I must again
disagree with you :)
I write Javascript code for a living, so I think I know what I'm talking about:

var a = ["foo", "bar", "baz"];  // defining an array
a[0];    // "foo"
a["0"];  // this would also work, but only because JS casts "0" to
         // integer implicitly

there is no string indexing with arrays, only with objects
(associative arrays, maps, call it as you like):

var o = {foo: "bar", 0: "baz"};  // defining an object (a.k.a. AA)
o.foo;     // "bar"
o["foo"];  // same, returns "bar"
o[0];      // "baz"; now, this just looks like indexing an array,
           // but it really isn't, it's a property getter; JS allows you to
           // have numeric properties so it can be confusing, I admit.

That's all I have to say about JS arrays, won't be talking non-D anymore :)

Regards,
Aleksandar


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread KennyTM~

On Apr 4, 11 04:07, Aleksandar Ružičić wrote:

>> You mean Python and Ruby.
>>
>>  - Javascript does not support negative indices. In fact, JS has no true
>> arrays, it only has associative arrays.
>>  - PHP does not support negative indices. http://ideone.com/8MZ2T
>
> I was talking about javascript's String.prototype.indexOf() and php's
> strpos functions, not about array indexing.

I see.

> But even for that I wasn't correct :/. A negative start-at index is
> available for substr (both in php and js), that's why I confused
> it with indexOf (I thought these things were consistent..)

PHP will never be consistent. ;)

> And javascript _does_ have true arrays, but it _doesn't_ have true
> associative arrays (those are object literals).

I would not call it a true array if it is indexed by string internally.
Anyway, this is not the main point.

>> This does not mean negative indices are useless (I use them all the time when
>> programming in Python), but D shouldn't add a feature just because other
>> languages have it, or because you think another language has it.
>
> I know, I was just expressing my opinion (what I would like to see in
> a language; I have never programmed in Python or Perl, so I thought
> that negative indices for array indexing were not supported in any
> language that I know of), I wasn't proposing a new feature :)


Right.


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Aleksandar Ružičić
> You mean Python and Ruby.
>
>  - Javascript does not support negative indices. In fact, JS has no true
> arrays, it only has associative arrays.
>  - PHP does not support negative indices. http://ideone.com/8MZ2T

I was talking about javascript's String.prototype.indexOf() and php's
strpos functions, not about array indexing.
But even for that I wasn't correct :/. A negative start-at index is
available for substr (both in php and js), that's why I confused
it with indexOf (I thought these things were consistent..)

And javascript _does_ have true arrays, but it _doesn't_ have true
associative arrays (those are object literals).

> This does not mean negative indices are useless (I use them all the time when
> programming in Python), but D shouldn't add a feature just because other
> languages have it, or because you think another language has it.

I know, I was just expressing my opinion (what I would like to see in
a language; I have never programmed in Python or Perl, so I thought
that negative indices for array indexing were not supported in any
language that I know of), I wasn't proposing a new feature :)


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread KennyTM~

On Apr 4, 11 02:29, Aleksandar Ružičić wrote:

> On Sun, Apr 3, 2011 at 8:16 PM, Andrei Alexandrescu wrote:
>> It's not.
>
> Seems I've missed that in the docs, I thought it would always make a copy :)
>
>> I think that's a natural and simple improvement of indexOf. The one aspect
>> that I'm unsure about is starting from the end for negative indices.
>
> Negative indices might seem a bit odd but it's standard in other
> languages (like javascript and php which I've already mentioned).
> I would even like to see this in D:

You mean Python and Ruby.

 - Javascript does not support negative indices. In fact, JS has no true
arrays, it only has associative arrays.

 - PHP does not support negative indices. http://ideone.com/8MZ2T

Many other languages that I've heard of, like C#, C, C++, Go, Haskell and
Java, also do not support negative indices.

Also, interestingly, Perl 5 had negative indices, but Perl 6 killed them.
(http://perlcabal.org/syn/S09.html#Negative_and_differential_subscripts)

"The Perl 6 semantics avoids indexing discontinuities (a source of
 subtle runtime errors), and provides ordinal access in both
 directions at both ends of the array."

This does not mean negative indices are useless (I use them all the time when
programming in Python), but D shouldn't add a feature just because other
languages have it, or because you think another language has it.

> array[-2];  // get 'a' from "foobar"
>
> same for slicing:
>
> array[-4..2];  // get "ob" from "foobar"
>
>> Could you please submit an enhancement request to bugzilla?
>
> sure!




Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread KennyTM~

On Apr 4, 11 02:24, spir wrote:


> Agreed this is a fairly standard param in other languages, but D easily
> (and rather cheaply) allows
> auto pos = indexOf(s[i..$], char);
>
> I would be far more interested in generalised negative indices in D, à
> la Python. A great step for D's friendliness, and a final end to
> current '$' issues (1).
>
> Denis
>
> (1) Which works only for builtin types via compiler magic, because it
> neither maps to .length, nor is overloadable.


There should be opDollar http://d.puremagic.com/issues/show_bug.cgi?id=3474.
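
For readers on a current compiler: D now accepts a user-defined opDollar, so '$' is no longer limited to builtin types. A minimal sketch follows; the IntList type is made up for this illustration.

```d
// Minimal user-defined container; IntList is a hypothetical name.
struct IntList
{
    int[] data;

    // Inside the index brackets, $ is rewritten to a call of opDollar().
    size_t opDollar() const { return data.length; }

    // Plain element access, so list[$ - 1] reads the last element.
    int opIndex(size_t i) const { return data[i]; }
}
```

With auto list = IntList([10, 20, 30]), the expression list[$ - 1] evaluates to 30. Negative indices are still not part of the language; '$' remains the way to count from the end.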


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Aleksandar Ružičić
On Sun, Apr 3, 2011 at 8:16 PM, Andrei Alexandrescu wrote:
> It's not.

Seems I've missed that in the docs, I thought it would always make a copy :)

> I think that's a natural and simple improvement of indexOf. The one aspect
> that I'm unsure about is starting from the end for negative indices.

Negative indices might seem a bit odd but it's standard in other
languages (like javascript and php which I've already mentioned).
I would even like to see this in D:

array[-2];  // get 'a' from "foobar"

same for slicing:

array[-4..2];  // get "ob" from "foobar"

> Could you please submit an enhancement request to bugzilla?

sure!


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread spir

On 04/03/2011 07:39 PM, Aleksandar Ružičić wrote:

> I needed std.string.indexOf to accept a start position in the string to
> start the search at. I was really surprised when I realized that this
> (to me) standard parameter is "missing" (I'm used to indexOf in
> javascript, strpos in php and equivalent methods in other languages,
> which support a start offset parameter).
>
> There might be some other function (in some other module) that does
> what I want but I wasn't able to find it (I find D's documentation not
> easy to search and read), so I've copied indexOf to my module and
> added the wanted functionality:
>
> https://gist.github.com/900589
>
> now, I'm able to write, for example:
>
> auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th
>                                        // char in haystack
>
> and
>
> auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th
>                                        // char from the end
>
> My question is: is there a reason why this functionality is missing
> from phobos (maybe there's some language feature I'm not aware of?) and
> if no such reason exists, would it be possible to add it in a future
> version of phobos/dmd?


Agreed this is a fairly standard param in other languages, but D easily (and
rather cheaply) allows

auto pos = indexOf(s[i..$], char);

I would be far more interested in generalised negative indices in D, à la
Python. A great step for D's friendliness, and a final end to current '$'
issues (1).


Denis

(1) Which works only for builtin types via compiler magic, because it neither
maps to .length, nor is overloadable.

--
_
vita es estrany
spir.wikidot.com



Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Andrei Alexandrescu

On 4/3/11 1:14 PM, Aleksandar Ružičić wrote:

> I thought first of slicing, but isn't that making a copy of the string?

It's not.

> And also, if I'm not mistaken, if you slice out of range bounds (i.e.
> haystack[5..$] when haystack.length < 5) you'll get an exception, right?

Correct.

> That's why I think this would be a nice feature to have, so you don't
> have to worry whether the start position is within the string bounds, and you
> won't need to write this:

I think that's a natural and simple improvement of indexOf. The one
aspect that I'm unsure about is starting from the end for negative indices.

Could you please submit an enhancement request to bugzilla?


Andrei


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Aleksandar Ružičić
I thought first of slicing, but isn't that making a copy of the string?
And also, if I'm not mistaken, if you slice out of range bounds (i.e.
haystack[5..$] when haystack.length < 5) you'll get an exception, right?

That's why I think this would be a nice feature to have, so you don't
have to worry whether the start position is within the string bounds, and you
won't need to write this:

> auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;

when you want to start the search from the end (since it's somewhat less
readable than indexOf(haystack, '$', -5)).

On Sun, Apr 3, 2011 at 7:55 PM, Robert Jacques  wrote:
> On Sun, 03 Apr 2011 13:39:40 -0400, Aleksandar Ružičić
>  wrote:
>
>> I needed std.string.indexOf to accept start position in the string to
>> start the search at. I was really surprised when I realized that this
>> (to me) standard parameter is "missing" (I'm used to indexOf in
>> javascript, strpos in php and equivalent methods in other languages,
>> which support start offset parameter).
>>
>> There might be some other function (in some other module) that does
>> what I want but I wasn't able to find it (I find D's documentation not
>> easy to search and read), so I've copied indexOf to my module and
>> added wanted functionality:
>>
>> https://gist.github.com/900589
>>
>> now, I'm able to write, for example:
>>
>> auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th
>>                                        // char in haystack
>
> auto pos = indexOf(haystack[10..$], '$') + 10;
>
>> and
>>
>> auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th
>>                                        // char from the end
>
> auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;
>
>> My question is: is there a reason why this functionality is missing
>> from phobos (maybe there's some language feature I'm not aware of?) and
>> if no such reason exists, would it be possible to add it in a future
>> version of phobos/dmd?
>
> Yes, the language feature is called slicing. See above. Also, you may want
> to look at the various find methods in std.algorithm. Generally, it's better
> to work with ranges/slices than indexes due to UTF's encoding scheme.
>


Re: std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Robert Jacques
On Sun, 03 Apr 2011 13:39:40 -0400, Aleksandar Ružičić wrote:

> I needed std.string.indexOf to accept a start position in the string to
> start the search at. I was really surprised when I realized that this
> (to me) standard parameter is "missing" (I'm used to indexOf in
> javascript, strpos in php and equivalent methods in other languages,
> which support a start offset parameter).
>
> There might be some other function (in some other module) that does
> what I want but I wasn't able to find it (I find D's documentation not
> easy to search and read), so I've copied indexOf to my module and
> added the wanted functionality:
>
> https://gist.github.com/900589
>
> now, I'm able to write, for example:
>
> auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th
>                                        // char in haystack

auto pos = indexOf(haystack[10..$], '$') + 10;

> and
>
> auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th
>                                        // char from the end

auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;

> My question is: is there a reason why this functionality is missing
> from phobos (maybe there's some language feature I'm not aware of?) and
> if no such reason exists, would it be possible to add it in a future
> version of phobos/dmd?


Yes, the language feature is called slicing. See above. Also, you may want  
to look at the various find methods in std.algorithm. Generally, it's  
better to work with ranges/slices than indexes due to UTF's encoding  
scheme.
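
A small sketch of the range-based approach mentioned above, assuming a current Phobos where find lives in std.algorithm.searching (in 2011 it was in std.algorithm). The helper names findFrom and indexFromTail are made up for this example.

```d
import std.algorithm.searching : find;

// Search haystack from code unit offset start. Returns the tail of the
// string beginning at the first match, or an empty slice if there is none.
string findFrom(string haystack, dchar needle, size_t start)
{
    return find(haystack[start .. $], needle);
}

// Recover an index into haystack from a tail returned by findFrom.
ptrdiff_t indexFromTail(string haystack, string tail)
{
    return tail.length == 0
        ? -1
        : cast(ptrdiff_t)(haystack.length - tail.length);
}
```

Working with the returned tail directly avoids the index arithmetic altogether; the index is only computed when a caller really needs a number.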


std.string.indexOf with an optional start-at parameter?

2011-04-03 Thread Aleksandar Ružičić
I needed std.string.indexOf to accept a start position in the string to
start the search at. I was really surprised when I realized that this
(to me) standard parameter is "missing" (I'm used to indexOf in
javascript, strpos in php and equivalent methods in other languages,
which support a start offset parameter).

There might be some other function (in some other module) that does
what I want but I wasn't able to find it (I find D's documentation not
easy to search and read), so I've copied indexOf to my module and
added the wanted functionality:

https://gist.github.com/900589

now, I'm able to write, for example:

auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th
                                       // char in haystack

and

auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th
                                       // char from the end

My question is: is there a reason why this functionality is missing
from phobos (maybe there's some language feature I'm not aware of?) and
if no such reason exists, would it be possible to add it in a future
version of phobos/dmd?