Re: boolean over multiple variables

2010-01-26 Thread Bill Baxter
On Tue, Jan 26, 2010 at 6:16 PM, Nick Sabalausky  wrote:
>>"Bill Baxter"  wrote in message
>>news:mailman.34.1264542189.4461.digitalmars-d-le...@puremagic.com...
>>On Tue, Jan 26, 2010 at 1:21 PM, bearophile 
>>wrote:
>>> Nick Sabalausky:
>>>> Aside from that being how Python does it, why do you see that as
>>>> preferable?
>>>
>>> Because:
>>> 1) linear searches in an array are damn common. I don't remember the
>>> results of my benchmarks, but until your integer arrays is quite longer
>>> than 30-50 items, performing a linear search is faster than a lookup in
>>> an AA, on DMD. On Tango this number is probably 70% higher
>>> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't
>>> perform a linear search, it uses a much smarter search that has a
>>> complexity lower than the product of the two lengths, using a custom
>>> algorithm. So in D you can use the same syntax to search for
>>> substrings/subarrays. Where such smarter search is not possible, D can
>>> use a naive search.
>>> 2) It's really handy. I use isIn(item, items) to search on arrays in D,
>>> but having a item in items is nicer.
>>> 3) You can use the same syntax to search into anything that's lazily
>>> iterable too (a Range). This is very handy.
>>>
>>>
>>>> So having a single syntax work on the outputs for
>>>> regular arrays, but then on the inputs for AAs, seems highly
>>>> inconsistent
>>>> and error-prone to me.
>>>
>>> I have followed many Python newbies personally, I am following the Python
>>> newsgroups, and I have programmed for years in Python, and while I have
>>> seen many different kinds of bugs, I have not seen a significant amount
>>> of bugs in this. Python programmers just learn that dicts and lists are a
>>> little different in this regard. At the same way they learn that a set
>>> and a dict are different data structures, with different capabilities and
>>> usages.
>>
>>It's not even really  inconsistent if you just think about these data
>>structures in terms of function rather than form.
>>An array is often used as a simple set of things.  "O in Array" means
>>"is O in that set of things"
>>An AA is a set of things that also have some associated data.  "O in
>>AA" means "is O in that set of things" (not the ancillary data)
>>If you have an actual "set" data structure for containing a set of
>>things, then "O in Set" means, again, "is O in that set of things".
>>(In fact the closest thing D has to a built-in set type is an AA with
>>"don't care" associated data, reinforcing the notion of AA as a set
>>plus extra data.)
>>
>>
>
> Even looking at function rather than form, I still think it's inaccurate to
> consider the keys to be the elements of an AA. In most uses of an AA, the
> key is primarily something convenient with which to look up data. They hold
> significance, but typically not as much as the data that is looked up with
> it. What you've described is very much like (and quite literally the same
> as, in the case of many dynamic languages) thinking of a variable's name as
> the actual data, and thinking of the value it holds merely as "ancillary
> data".
>
> Keep in mind too, even with a regular array, the index can still hold
> significance as data. For instance, I could have an array of Foo's, declare
> that any element with an odd index has property 'A' and any with an even
> index has property 'B', and treat them as such. May seem strange at a
> glance, but such things are common in low-level, low-resource and
> performance-oriented code. Bottom line, though, is that "Property 'A' or
> 'B'" is data that has now been encoded in the array's index, but despite that,
> the indices still aren't considered the array's elements. And the data they
> look up still isn't considered "ancillary data".
>
> And yes, a Hashed Set may likely be *implemented* as an AA with just keys,
> but that's just form, it doesn't imply a similarity in regard to function.
> The *function* of a HashSet is that of an unordered array that's been
> optimized for "contains X / doesn't contain X" on large data sets.

> Containers and their functions:
> - AA: Store A's with label B, A is fairly important, B may or may not be.
> - Array: Store A's with label B, A is fairly important, B may or may not be,
> B has certain restrictions, container overall has different performance
> characteristics from an AA.
> - Hashed Set: Store A's.


All I am trying to say is that there are multiple ways of looking at
the functionality offered by different containers.
And there exists a way of looking at it where the Python-style 'in'
operator can be seen as behaving consistently.

You asserted it is inconsistent.  I'm just saying it's only
inconsistent if you insist that one particular way of looking at the
containers is the "right" way.

--bb


Re: boolean over multiple variables

2010-01-26 Thread Rainer Deyke
bearophile wrote:
> Nick Sabalausky:
>> Aside from that being how Python does it, why do you see that as
>> preferable?
> 
> Because: 1) linear searches in an array are damn common. I don't
> remember the results of my benchmarks, but until your integer arrays
> is quite longer than 30-50 items, performing a linear search is
> faster than a lookup in an AA, on DMD. On Tango this number is
> probably 70% higher 1b) In Python if you perform a "foo" in "barfoo"
> the language doesn't perform a linear search, it uses a much smarter
> search that has a complexity lower than the product of the two
> lengths, using a custom algorithm. So in D you can use the same
> syntax to search for substrings/subarrays. Where such smarter search
> is not possible, D can use a naive search. 2) It's really handy. I
> use isIn(item, items) to search on arrays in D, but having a item in
> items is nicer. 3) You can use the same syntax to search into
> anything that's lazily iterable too (a Range). This is very handy.

I would add to that:
4) Because 'in' is an operator, and operators are expected to bear a
greater weight than ordinary functions.

If 'in' was an ordinary method, say 'a.contains(b)', then I would choose
different method names for searching an array for a value and searching
an associative array for a key.  Probably something like:
  array.contains(value)
  associative_array.containsKey(key)

However, since 'in' already is an infix operator, it should have the
most widely applicable semantics.  Operators are heavyweight syntactic
sugar for function calls.  There is no room in D for operators that are
only rarely useful.


-- 
Rainer Deyke - rain...@eldwood.com


Re: boolean over multiple variables

2010-01-26 Thread Nick Sabalausky
>"Bill Baxter"  wrote in message 
>news:mailman.34.1264542189.4461.digitalmars-d-le...@puremagic.com...
>On Tue, Jan 26, 2010 at 1:21 PM, bearophile  
>wrote:
>> Nick Sabalausky:
>>> Aside from that being how Python does it, why do you see that as 
>>> preferable?
>>
>> Because:
>> 1) linear searches in an array are damn common. I don't remember the 
>> results of my benchmarks, but until your integer arrays is quite longer 
>> than 30-50 items, performing a linear search is faster than a lookup in 
>> an AA, on DMD. On Tango this number is probably 70% higher
>> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't 
>> perform a linear search, it uses a much smarter search that has a 
>> complexity lower than the product of the two lengths, using a custom 
>> algorithm. So in D you can use the same syntax to search for 
>> substrings/subarrays. Where such smarter search is not possible, D can 
>> use a naive search.
>> 2) It's really handy. I use isIn(item, items) to search on arrays in D, 
>> but having a item in items is nicer.
>> 3) You can use the same syntax to search into anything that's lazily 
>> iterable too (a Range). This is very handy.
>>
>>
>>> So having a single syntax work on the outputs for
>>> regular arrays, but then on the inputs for AAs, seems highly 
>>> inconsistent
>>> and error-prone to me.
>>
>> I have followed many Python newbies personally, I am following the Python 
>> newsgroups, and I have programmed for years in Python, and while I have 
>> seen many different kinds of bugs, I have not seen a significant amount 
>> of bugs in this. Python programmers just learn that dicts and lists are a 
>> little different in this regard. At the same way they learn that a set 
>> and a dict are different data structures, with different capabilities and 
>> usages.
>
>It's not even really  inconsistent if you just think about these data
>structures in terms of function rather than form.
>An array is often used as a simple set of things.  "O in Array" means
>"is O in that set of things"
>An AA is a set of things that also have some associated data.  "O in
>AA" means "is O in that set of things" (not the ancillary data)
>If you have an actual "set" data structure for containing a set of
>things, then "O in Set" means, again, "is O in that set of things".
>(In fact the closest thing D has to a built-in set type is an AA with
>"don't care" associated data, reinforcing the notion of AA as a set
>plus extra data.)
>
>

Even looking at function rather than form, I still think it's inaccurate to
consider the keys to be the elements of an AA. In most uses of an AA, the 
key is primarily something convenient with which to look up data. They hold 
significance, but typically not as much as the data that is looked up with 
it. What you've described is very much like (and quite literally the same 
as, in the case of many dynamic languages) thinking of a variable's name as 
the actual data, and thinking of the value it holds merely as "ancillary 
data".

Keep in mind too, even with a regular array, the index can still hold 
significance as data. For instance, I could have an array of Foo's, declare
that any element with an odd index has property 'A' and any with an even
index has property 'B', and treat them as such. May seem strange at a
glance, but such things are common in low-level, low-resource and
performance-oriented code. Bottom line, though, is that "Property 'A' or
'B'" is data that has now been encoded in the array's index, but despite that,
the indices still aren't considered the array's elements. And the data they
look up still isn't considered "ancillary data".

And yes, a Hashed Set may likely be *implemented* as an AA with just keys, 
but that's just form, it doesn't imply a similarity in regard to function. 
The *function* of a HashSet is that of an unordered array that's been 
optimized for "contains X / doesn't contain X" on large data sets.

Containers and their functions:
- AA: Store A's with label B, A is fairly important, B may or may not be.
- Array: Store A's with label B, A is fairly important, B may or may not be, 
B has certain restrictions, container overall has different performance 
characteristics from an AA.
- Hashed Set: Store A's.




Re: Default Delegate Parameter

2010-01-26 Thread Jesse Phillips
BCS wrote:

> Hello Jesse,
>
>> For the following code I get the error below. I'm wondering if I
>> should be reporting a bug, or if creating default delegates is
>> correctly prevented?
>> 
>> .\defltdg.d(10): Error: delegate defltdg.__dgliteral3 is a nested
>> function and cannot be accessed from main
>> 
>> import std.stdio;
>> 
>> void main() {
>> take(() {writeln("Hello world");});
>> take(() {});
>> take();
>> }
>> void take(void delegate() dg = () {}) {
>> dg();
>> }
>
> I don't know if this is a bug or what but I think this happens because the 
> default is defined (compile time) in the scope of take but generated (run 
> time) in the scope of main. 
>
> I'd be fine with a special case for this that either 1) allows a delegate 
> that doesn't access any outer scope to be generated like that or 2) special 
> case the code gen to correctly generate the delegate for the function (the 
> new frame pointer can be computed at that point)
>
> work around:
>
> void take() { take(() {});}
> void take(void delegate() dg) { dg(); }


At least that explains why. I should have mentioned that I already had the workaround.

Thanks.



rt_attachDisposeEvent: the apparent magic behind std.signals

2010-01-26 Thread Gareth Charnock
I was looking at the std.signals code in svn to find out how the magic 
of the observer class not needing to inherit anything was done and I was 
somewhat disappointed to see rt_attachDisposeEvent. Is this function 
standardised or exposed anywhere? I can think of cases where being able 
to listen for the dying screams of deleted objects might be useful.


Re: boolean over multiple variables

2010-01-26 Thread bearophile
Nick Sabalausky:

>I don't see how any of that argues against the idea of making "in" always 
>operate on the elements and having a different method for checking the keys.<

I have already done my best with those words, so... :-)

AA elements are its keys, that are a set. In Python3 if you have a dict named 
foo, then foo.keys() returns something that's very like a set view. And foo 
itself acts like a set (you can iterate on the items of this set, etc).
The values are the things associated to that set of elements.
So maybe you are seeing associative arrays as arrays, while in Python they are
seen as sets with associated values :-) And seeing them as a special kind of 
set is better, because it gives you some handy syntax back, that I have shown 
you. Maybe we can change their name in D and call them "Associative Sets" :o)
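
As a small illustration of the "dict as a set with associated values" view described above, here is a Python 3 sketch. The contents of `foo` are made up for the example; only the name comes from the post.

```python
# Hypothetical example data; only the name `foo` comes from the post.
foo = {"a": 1, "b": 2, "c": 3}

keys = foo.keys()                  # a set-like view in Python 3
common = keys & {"b", "c", "z"}    # set intersection works directly on the view
print(common == {"b", "c"})        # True

# Iterating the dict itself yields its keys, and `in` tests the keys.
print(sorted(foo))                 # ['a', 'b', 'c']
print("a" in foo, 1 in foo)        # True False

# The associated values have to be asked for explicitly.
print(1 in foo.values())           # True
```

The last line is the spelling Python uses when the values, rather than the keys, are what should be searched.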


>I'm sure I could, but that doesn't change anything. I've used a lot of 
>languages with lots of poorly-designed features, and I've always been able to 
>deal with the problems in those poorly-designed features. But just because I 
>can get used to dealing with them doesn't mean they're not poorly-designed or 
>that I wouldn't prefer or be better off with something different.<

What kind of problems has "in" caused in your Python programs?


>Ex: All of us get along just fine with "if(is())", but it's still widely 
>considered a design in need of fixing. Ex: C/C++ programmers get by with its 
>system of #include and header files just fine. But obviously it still had 
>plenty of worthwhile room for improvement.<

You can find several people (me, for example) who think those are warts or
badly designed things (or things designed for much less powerful computers, 
#include). While if you take a sample of 5 Python and Ruby programmers you 
will not find many of them that think that "in" is badly designed  (or not very 
handy) in Python. If you search on the web you can find pages that list some 
Python warts, you will not find "in" among them. You can find things like:
class Foo:
  def __init__(self, x=[]): ...
Where people say that [] causes problems or they don't like that "self" as 
first argument, etc.
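
For readers who have not met the `x=[]` wart mentioned above, a minimal sketch follows. The class `Bar` and the `None` idiom are additions for illustration and are not part of the post.

```python
class Foo:
    def __init__(self, x=[]):      # the default list is created once, at definition time
        self.x = x

a = Foo()
b = Foo()
a.x.append(1)
print(b.x)                         # [1]: both instances share the same default list

class Bar:                         # hypothetical fixed variant, not from the post
    def __init__(self, x=None):    # common idiom: build a fresh list on every call
        self.x = [] if x is None else x

c = Bar()
d = Bar()
c.x.append(1)
print(d.x)                         # []
```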

I am probably not going to change your mind (or Walter's; he probably
doesn't even read the d.learn group), so this discussion is mostly
academic :-)

Bye,
bearophile


Re: boolean over multiple variables

2010-01-26 Thread Nick Sabalausky
"bearophile"  wrote in message 
news:hjnmdl$166...@digitalmars.com...
> Nick Sabalausky:
>> Aside from that being how Python does it, why do you see that as 
>> preferable?
>
> Because:
> 1) linear searches in an array are damn common. I don't remember the 
> results of my benchmarks, but until your integer arrays is quite longer 
> than 30-50 items, performing a linear search is faster than a lookup in an 
> AA, on DMD. On Tango this number is probably 70% higher
> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't 
> perform a linear search, it uses a much smarter search that has a 
> complexity lower than the product of the two lengths, using a custom 
> algorithm. So in D you can use the same syntax to search for 
> substrings/subarrays. Where such smarter search is not possible, D can use 
> a naive search.
> 2) It's really handy. I use isIn(item, items) to search on arrays in D, 
> but having a item in items is nicer.
> 3) You can use the same syntax to search into anything that's lazily 
> iterable too (a Range). This is very handy.
>

I don't see how any of that argues against the idea of making "in" always 
operate on the elements and having a different method for checking the keys. 
Can you be more clear on that point?

>
>> So having a single syntax work on the outputs for
>> regular arrays, but then on the inputs for AAs, seems highly inconsistent
>> and error-prone to me.
>
> I have followed many Python newbies personally, I am following the Python 
> newsgroups, and I have programmed for years in Python, and while I have 
> seen many different kinds of bugs, I have not seen a significant amount of 
> bugs in this. Python programmers just learn that dicts and lists are a 
> little different in this regard. At the same way they learn that a set and 
> a dict are different data structures, with different capabilities and 
> usages.
>
> Why don't you start using Python, I think in 5 days you can tell that's 
> easy to not confuse the following usages:
> 5 in {5:1, 2:2, 5:3}
> 5 in [1, 2, 5]
> "5" in "125"
> "25" in "125"
>

I'm sure I could, but that doesn't change anything. I've used a lot of 
languages with lots of poorly-designed features, and I've always been able 
to deal with the problems in those poorly-designed features. But just 
because I can get used to dealing with them doesn't mean they're not 
poorly-designed or that I wouldn't prefer or be better off with something 
different. (And for the record, I have used a bit of Python here and there. 
Still not particularly happy with it.)

Ex: All of us get along just fine with "if(is())", but it's still widely 
considered a design in need of fixing.
Ex: C/C++ programmers get by with its system of #include and header files 
just fine. But obviously it still had plenty of worthwhile room for 
improvement.




Re: Default Delegate Parameter

2010-01-26 Thread BCS

Hello Jesse,


> For the following code I get the error below. I'm wondering if I
> should be reporting a bug, or if creating default delegates is
> correctly prevented?
>
> .\defltdg.d(10): Error: delegate defltdg.__dgliteral3 is a nested
> function and cannot be accessed from main
>
> import std.stdio;
>
> void main() {
>    take(() {writeln("Hello world");});
>    take(() {});
>    take();
> }
> void take(void delegate() dg = () {}) {
>    dg();
> }


I don't know if this is a bug or what but I think this happens because the 
default is defined (compile time) in the scope of take but generated (run 
time) in the scope of main. 

I'd be fine with a special case for this that either 1) allows a delegate 
that doesn't access any outer scope to be generated like that or 2) special 
case the code gen to correctly generate the delegate for the function (the 
new frame pointer can be computed at that point)


work around:

void take() { take(() {});}
void take(void delegate() dg) { dg(); }







Default Delegate Parameter

2010-01-26 Thread Jesse Phillips
For the following code I get the error below. I'm wondering if I should be
reporting a bug, or if creating default delegates is correctly prevented?

.\defltdg.d(10): Error: delegate defltdg.__dgliteral3 is a nested function and 
cannot be accessed from main

import std.stdio;

void main() {
   take(() {writeln("Hello world");});
   take(() {});
   take();
}


void take(void delegate() dg = () {}) {
   dg();
}


Re: boolean over multiple variables

2010-01-26 Thread Bill Baxter
On Tue, Jan 26, 2010 at 1:21 PM, bearophile  wrote:
> Nick Sabalausky:
>> Aside from that being how Python does it, why do you see that as preferable?
>
> Because:
> 1) linear searches in an array are damn common. I don't remember the results 
> of my benchmarks, but until your integer arrays is quite longer than 30-50 
> items, performing a linear search is faster than a lookup in an AA, on DMD. 
> On Tango this number is probably 70% higher
> 1b) In Python if you perform a "foo" in "barfoo" the language doesn't perform 
> a linear search, it uses a much smarter search that has a complexity lower 
> than the product of the two lengths, using a custom algorithm. So in D you 
> can use the same syntax to search for substrings/subarrays. Where such 
> smarter search is not possible, D can use a naive search.
> 2) It's really handy. I use isIn(item, items) to search on arrays in D, but 
> having a item in items is nicer.
> 3) You can use the same syntax to search into anything that's lazily iterable 
> too (a Range). This is very handy.
>
>
>> So having a single syntax work on the outputs for
>> regular arrays, but then on the inputs for AAs, seems highly inconsistent
>> and error-prone to me.
>
> I have followed many Python newbies personally, I am following the Python 
> newsgroups, and I have programmed for years in Python, and while I have seen 
> many different kinds of bugs, I have not seen a significant amount of bugs in 
> this. Python programmers just learn that dicts and lists are a little 
> different in this regard. At the same way they learn that a set and a dict 
> are different data structures, with different capabilities and usages.

It's not even really  inconsistent if you just think about these data
structures in terms of function rather than form.
An array is often used as a simple set of things.  "O in Array" means
"is O in that set of things"
An AA is a set of things that also have some associated data.  "O in
AA" means "is O in that set of things" (not the ancillary data)
If you have an actual "set" data structure for containing a set of
things, then "O in Set" means, again, "is O in that set of things".
(In fact the closest thing D has to a built-in set type is an AA with
"don't care" associated data, reinforcing the notion of AA as a set
plus extra data.)

--bb
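
The "function rather than form" reading above can be tried out in Python, where `in` already behaves this way. This is only a sketch; all variable names are invented for the example.

```python
things_list = ["x", "y", "z"]                  # array used as a simple set of things
things_dict = {"x": 10, "y": 20, "z": 30}      # set of things plus associated data
things_set = {"x", "y", "z"}                   # an actual set
dont_care = dict.fromkeys(things_list, True)   # dict with "don't care" values used as a set

containers = [things_list, things_dict, things_set, dont_care]

# "is O in that set of things" gets the same answer for every container.
print(all("y" in c for c in containers))       # True
print(any("q" in c for c in containers))       # False

# The associated data is not part of the set being searched.
print(20 in things_dict)                       # False
```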


Re: boolean over multiple variables

2010-01-26 Thread bearophile
Nick Sabalausky:
> Aside from that being how Python does it, why do you see that as preferable? 

Because:
1) linear searches in an array are damn common. I don't remember the results of
my benchmarks, but until your integer array is considerably longer than 30-50
items, performing a linear search is faster than a lookup in an AA, on DMD. On
Tango this number is probably 70% higher.
1b) In Python if you perform a "foo" in "barfoo" the language doesn't perform a 
linear search, it uses a much smarter search that has a complexity lower than 
the product of the two lengths, using a custom algorithm. So in D you can use 
the same syntax to search for substrings/subarrays. Where such smarter search 
is not possible, D can use a naive search.
2) It's really handy. I use isIn(item, items) to search on arrays in D, but
being able to write "item in items" is nicer.
3) You can use the same syntax to search into anything that's lazily iterable 
too (a Range). This is very handy.


> So having a single syntax work on the outputs for 
> regular arrays, but then on the inputs for AAs, seems highly inconsistent 
> and error-prone to me.

I have followed many Python newbies personally, I am following the Python 
newsgroups, and I have programmed for years in Python, and while I have seen 
many different kinds of bugs, I have not seen a significant amount of bugs in 
this. Python programmers just learn that dicts and lists are a little different 
in this regard. In the same way they learn that a set and a dict are different
data structures, with different capabilities and usages.

Why don't you try using Python? I think within 5 days you will find that it's
easy not to confuse the following usages:
5 in {5:1, 2:2, 5:3}
5 in [1, 2, 5]
"5" in "125"
"25" in "125"

Bye,
bearophile
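
For reference, the four usages listed above can be evaluated as follows in Python 3. This is a sketch; the name `d` and the two extra negative checks are additions for illustration.

```python
d = {5: 1, 2: 2, 5: 3}     # the duplicate key 5 keeps only the last value
print(len(d), d[5])        # 2 3

print(5 in d)              # True:  dict membership tests the keys
print(3 in d)              # False: 3 is only a value
print(5 in [1, 2, 5])      # True:  list membership tests the elements
print("5" in "125")        # True:  single character
print("25" in "125")       # True:  contiguous substring
print("15" in "125")       # False: both characters occur, but not next to each other
```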


Re: boolean over multiple variables

2010-01-26 Thread BCS

Hello Nick,

> "Pelle Månsson"  wrote in message
> news:hjmmod$1io...@digitalmars.com...
>>
>> I think in should work for keys in an associative array and for
>> values in a regular array.
>>
>> This is how it works in python.
>
> Aside from that being how Python does it, why do you see that as
> preferable? I see both arrays and associative arrays as things that
> map an input value to an output value. The only significant
> differences are the implementation details, and the fact that regular
> arrays are more restrictive in their sets of valid inputs (must be
> integers, must start with 0, and must all be consecutive values). So
> having a single syntax work on the outputs for regular arrays, but
> then on the inputs for AAs, seems highly inconsistent and error-prone
> to me.


I think this is one of the few cases where the strictly logical choice is
not the way anyone expects things to work.


That said, however, it might make a difference in template code:

void fn(T)(T t, T u, int i)
{
   if(auto x = i in t) u[i] = *x; 
}






Re: boolean over multiple variables

2010-01-26 Thread Nick Sabalausky
"Pelle Månsson"  wrote in message 
news:hjmmod$1io...@digitalmars.com...
>
> I think in should work for keys in an associative array and for values in 
> a regular array.
>
> This is how it works in python.

Aside from that being how Python does it, why do you see that as preferable? 
I see both arrays and associative arrays as things that map an input value 
to an output value. The only significant differences are the implementation 
details, and the fact that regular arrays are more restrictive in their sets 
of valid inputs (must be integers, must start with 0, and must all be 
consecutive values). So having a single syntax work on the outputs for 
regular arrays, but then on the inputs for AAs, seems highly inconsistent 
and error-prone to me.
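
The asymmetry described above is easy to reproduce in Python, where `in` already works the way Pelle proposes. A minimal sketch, with made-up names `arr` and `aa` holding the same input-to-output mapping:

```python
arr = [10, 20, 30]              # maps inputs 0, 1, 2 to outputs 10, 20, 30
aa = {0: 10, 1: 20, 2: 30}      # the same mapping stored as a dict

print(20 in arr)                # True:  list `in` searches the outputs
print(20 in aa)                 # False: dict `in` searches the inputs
print(1 in arr)                 # False
print(1 in aa)                  # True

# Naming the side being searched gives matching answers for both containers.
print(1 in range(len(arr)), 1 in aa.keys())    # True True
print(20 in arr, 20 in aa.values())            # True True
```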




Re: boolean over multiple variables

2010-01-26 Thread bearophile
Pelle Månsson:
> I think in should work for keys in an associative array and for values 
> in a regular array.
> This is how it works in python.

opIn_r for normal arrays is something very natural. One of the very few people
who don't like it is Walter. Maybe I can create a small poll to see how many
agree that this is useful, semantically clean, and a really common thing to do.
This may or may not change his mind.
Some time ago I listed a few things that are both very handy and small, but
Walter has ignored them. I think he doesn't believe much in "programming in the
small".

Bye,
bearophile


Re: boolean over multiple variables

2010-01-26 Thread Robert Clipsham

On 22/01/10 21:55, strtr wrote:

> This may be a very basic question, but is there a way to let me omit a
> repeating variable when doing multiple boolean operations?
>
> if ( var == a || var == b || var == c || var == d)
> if ( var == (a || b || c || d) )


/**
 * Untested code, it works something like this though
 * Find tools at:
 *  http://dsource.org/projects/scrapple/browser/trunk/tools/tools
 */

import tools.base;

void main()
{
  T var, a, b, c, d;
  ...
  if ( var == a /or/ b /or/ c /or/ d )
  {
    /**
     * The same as:
     *
     * if ( var == a || var == b || var == c || var == d)
     * {
     *    ...
     * }
     */
  }
}


Re: boolean over multiple variables

2010-01-26 Thread Pelle Månsson

On 01/26/2010 01:02 AM, Nick Sabalausky wrote:
> "strtr"  wrote in message
> news:hjd6t1$be...@digitalmars.com...
>> This may be a very basic question, but is there a way to let me omit a
>> repeating variable when doing multiple boolean operations?
>>
>> if ( var == a || var == b || var == c || var == d)
>> if ( var == (a || b || c || d) )
>
> I do this:
>
> -
> import tango.core.Array;
>
> void main()
> {
>  if( [3, 5, 6, 12].contains(7) )
>  {
>  }
> }
> -
>
> There's probably a phobos equivalent, too.
>
> Although, I would much prefer what other people mentioned about having "in"
> refer to the values of a collection rather than the keys. But I've been
> using the above as a substitute.


I think in should work for keys in an associative array and for values 
in a regular array.


This is how it works in python.