Re: Order preservation and duplicate removal policy in `distinct`

2017-01-03 Thread Mike Rodriguez
Thanks for the feedback Alex.

As far as:
> If you wanted to file a jira on anything here, a jira to add a line to 
> the doc string stating that the first duplicate is kept would be the only 
> thing possibly worth doing.

I'll get one logged then.



On Saturday, December 31, 2016 at 12:05:16 PM UTC-5, Alex Miller wrote:
>
> Replying to many things in this thread at once here...
>
> Re lazy sequences, I think you can take it as implicit that the input seq 
> order is retained, and rely on it (same as other sequence functions).
>
> Re duplicates, the current implementation retains the first element, but 
> as mentioned this is not stated in the doc string (so probably should not 
> be something you rely upon). 
>
> "Equality" in this case is based upon set contains? checks, which 
> ultimately use Clojure's "equiv" notion of equality (NOT Java's .equals). 
> In general for Clojure, equality is based on values and two duplicate 
> values will be indistinguishable. However, two cases where that might not 
> be the case are when they have meta (which is not considered in equality) 
> or if they are arbitrary Java objects (which fall back to .equals behavior).
>
> If you wanted to file a jira on anything here, a jira to add a line to the 
> doc string stating that the first duplicate is kept would be the only thing 
> possibly worth doing.
>
> Alex
>
>
> On Wednesday, December 28, 2016 at 10:22:53 AM UTC-6, Mike Rodriguez wrote:
>>
>> The doc for `distinct` is:
>> "Returns a lazy sequence of the elements of coll with duplicates removed.
>>   Returns a stateful transducer when no collection is provided."
>>
>> (1) In the lazy sequence case, I've thought that maybe it is assumed 
>> there is a guarantee that the order of the input seq is preserved. 
>> However, this isn't stated.  Is this an assumption to rely on for 
>> `distinct` and, more generally, the Clojure seq-based API functions?
>>
>> (2) In either case, when there are duplicates, there do not seem to be 
>> any guarantees on which one of the duplicates will be preserved.  Should 
>> this be stated?  I'm thinking that maybe this is about Clojure's design 
>> philosophy being that equal values do not ever need to be distinguished 
>> between, so the API doesn't explicitly support this concern.  However, 
>> there are times when identity relationships can matter - performance would 
>> be one that comes to mind.
>> - This has some relationship to the Scala question @ 
>> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>>
>> There have been a few occasions where I relied on (or wanted to rely on) 
>> (1).  I haven't had many cases where (2) matters, but I could see it coming 
>> up on perhaps rare occasions. 
>>
>>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Order preservation and duplicate removal policy in `distinct`

2016-12-31 Thread Alex Miller
Replying to many things in this thread at once here...

Re lazy sequences, I think you can take it as implicit that the input seq 
order is retained, and rely on it (same as other sequence functions).
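
A small REPL sketch of the ordering point above. It shows what the current implementation is observed to do, which is not something the doc string promises:

```clojure
;; Observed behaviour of the current implementation, not a documented guarantee.
(distinct [3 1 3 2 1])
;;=> (3 1 2)

;; The transducer arity yields elements in the same order.
(into [] (distinct) [3 1 3 2 1])
;;=> [3 1 2]

;; Laziness: only as much of the infinite input is consumed as is needed.
(take 3 (distinct (cycle [:a :b :c])))
;;=> (:a :b :c)
```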

Re duplicates, the current implementation retains the first element, but as 
mentioned this is not stated in the doc string (so probably should not be 
something you rely upon). 

"Equality" in this case is based upon set contains? checks, which 
ultimately use Clojure's "equiv" notion of equality (NOT Java's .equals). 
In general for Clojure, equality is based on values and two duplicate 
values will be indistinguishable. However, two cases where that might not 
be the case are when they have meta (which is not considered in equality) 
or if they are arbitrary Java objects (which fall back to .equals behavior).
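
To make the equality point concrete, here is a minimal sketch. The names `a` and `b` are only for illustration, and the results reflect the current implementation:

```clojure
;; 1 and 1N are equal under Clojure's equiv, although Java's .equals disagrees.
(= 1 1N)         ;=> true
(.equals 1 1N)   ;=> false
(distinct [1 1N])
;;=> (1)

;; Metadata is ignored by equality, so it can tell two "duplicates" apart.
(def a (with-meta [1 2] {:origin :first}))
(def b (with-meta [1 2] {:origin :second}))
(= a b)                      ;=> true
(map meta (distinct [a b]))  ;=> ({:origin :first})
```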

If you wanted to file a jira on anything here, a jira to add a line to the 
doc string stating that the first duplicate is kept would be the only thing 
possibly worth doing.

Alex


On Wednesday, December 28, 2016 at 10:22:53 AM UTC-6, Mike Rodriguez wrote:
>
> The doc for `distinct` is:
> "Returns a lazy sequence of the elements of coll with duplicates removed.
>   Returns a stateful transducer when no collection is provided."
>
> (1) In the lazy sequence case, I've thought that maybe it is assumed there 
> is a guarantee that the order of the input seq is preserved.  However, this 
> isn't stated.  Is this an assumption to rely on for `distinct` and, more 
> generally, the Clojure seq-based API functions?
>
> (2) In either case, when there are duplicates, there do not seem to be any 
> guarantees on which one of the duplicates will be preserved.  Should this 
> be stated?  I'm thinking that maybe this is about Clojure's design 
> philosophy being that equal values do not ever need to be distinguished 
> between, so the API doesn't explicitly support this concern.  However, 
> there are times when identity relationships can matter - performance would 
> be one that comes to mind.
> - This has some relationship to the Scala question @ 
> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>
> There have been a few occasions where I relied on (or wanted to rely on) 
> (1).  I haven't had many cases where (2) matters, but I could see it coming 
> up on perhaps rare occasions. 
>
>



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-30 Thread Mike Rodriguez
Yeah, I was thinking about logging the ticket for it.  I just figured I'd 
discuss it on the google groups first to see if anyone else thought it was 
a useful concern.
It seems that some people have opinions on it in both directions, 
i.e. docs are sufficient vs docs are not sufficient.


On Thursday, December 29, 2016 at 5:50:14 PM UTC-6, Matching Socks wrote:
>
> How about a ticket for enhancement of the API documentation to clarify
> the nature of distinct's parameter (any seqable, even lazy)?  That would
> distinguish it from, e.g., (dedupe (sort coll)).
>



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-30 Thread Mike Rodriguez
On Thursday, December 29, 2016 at 5:47:14 PM UTC-6, Erik Assum wrote:
>
> Wouldn't the order be different depending on whether you keep the first or 
> the last?
>
> (distinct [1 2 1])
> => [1 2]
> vs
> (distinct [1 2 1])
> => [2 1]
>
> Erik. 
> -- 
> on the go
>
>
I should have thought about this scenario beforehand.  I was almost 
convinced here that there was no usefulness in knowing which duplicate was 
kept.  However, it can affect order, so the part (2) problem really relates 
to part (1).  Thanks for pointing that out.

And in reference to this one:
On Thursday, December 29, 2016 at 4:16:20 PM UTC-6, puzzler wrote:
> They may have different metadata, or some objects may already have cached 
hash values while others do not, or they may be complex enough objects that 
for a later part in the program it matters that certain equal objects meet 
the equality test quickly by actually being identical objects, not just 
equal.
 
These were the sort of things I was thinking about originally as the issue 
with my point (2).  I do agree that object identity has properties that can 
matter.  The metadata is a particularly good point.  
However, as I said above, the ordering concern is more of the issue I was 
mostly concerned with in general.

I appreciate everyone's input on this.



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Sean Corfield
I was trying to understand part 2 of the OP’s question (per the piece I had 
quoted in my response).

Part 1 of the OP’s question is pretty clear cut: the order of the result 
depends on which element is kept.

Given that you could use `set` instead of `distinct` if you *don’t* care about 
the order, it could reasonably be argued that `distinct` should indeed provide 
an ordering guarantee. Note that part 2 of the OP’s question would still hold 
for calling `set`.
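
A minimal sketch of that contrast, with `xs` as an illustrative name. The `distinct` result shown is what the current implementation returns; the iteration order of the set is deliberately not shown because it is unspecified:

```clojure
(def xs [3 1 3 2 1])

(distinct xs)                     ;=> (3 1 2)
(set xs)                          ; a set holding 1, 2 and 3, in no promised order

;; If order does not matter, both contain the same elements.
(= (set xs) (set (distinct xs)))  ;=> true
```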

 

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

 

On 12/29/16, 3:46 PM, "Erik Assum" wrote:

> Wouldn't the order be different depending on whether you keep the first or the 
> last?
>
> (distinct [1 2 1])
> => [1 2]
> vs
> (distinct [1 2 1])
> => [2 1]
>
> Erik. 
> -- 
> on the go
>
> On 29 Dec 2016, at 22:32, Sean Corfield wrote:
>
>> Can you provide a scenario when it matters? Given that you had two immutable, 
>> equal values in a collection, when would it matter which one was discarded and 
>> which one was kept?

 



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Matching Socks
How about a ticket for enhancement of the API documentation to clarify
the nature of distinct's parameter (any seqable, even lazy)?  That would
distinguish it from, e.g., (dedupe (sort coll)).
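
For comparison, a short sketch of how the two differ on an unsorted input (`coll` is just an illustrative name). `dedupe` only drops consecutive duplicates, and `sort` has to realize the whole collection first:

```clojure
(def coll [2 1 2 3 1])

(distinct coll)       ;=> (2 1 3)       input order kept, works lazily
(dedupe coll)         ;=> (2 1 2 3 1)   nothing is adjacent, so nothing is removed
(dedupe (sort coll))  ;=> (1 2 3)       sorted order, input order lost
```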



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Erik Assum
Wouldn't the order be different depending on whether you keep the first or the 
last?

(distinct [1 2 1])
=> [1 2]
vs
(distinct [1 2 1])
=> [2 1]
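
The two outcomes can be sketched at the REPL. `distinct-last` is a hypothetical helper written only for this comparison; it is not part of clojure.core:

```clojure
;; Hypothetical keep-the-last-duplicate variant, for illustration only.
(defn distinct-last
  [coll]
  (reverse (distinct (reverse coll))))

(distinct [1 2 1])       ;=> (1 2)   current behaviour: first occurrence kept
(distinct-last [1 2 1])  ;=> (2 1)   what keeping the last occurrence would give
```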

Erik. 
-- 
on the go

> On 29 Dec 2016, at 22:32, Sean Corfield wrote:
>  
> Can you provide a scenario when it matters? Given that you had two immutable, 
> equal values in a collection, when would it matter which one was discarded 
> and which one was kept?
>  
> 



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Mike Rodriguez
> If it helps anyone sleep better at night, were the behavior of distinct ever 
> to change in a way that breaks one's application, the original one is right 
> there in the git history, available for everyone's copying and use, with 
> whatever promises in the doc string you choose to add.

I understand it is easy to just fix it once it breaks. The point I had is just 
that without a guarantee it's an assumption. And if you don't think about it 
and it later changes, it can be subtle and hard to find. 

This actually came up as a discussion with my coworkers one day.  Someone 
mentioned Clojure doesn't seem to have any clear contract around whether 
distinct preserves order or not. And I had a hard time arguing that you can 
"just trust it" since it doesn't really state that it does. 

So the main suggestion I'm getting is to just write my own test for this sort 
of assumption to avoid issues. I'd prefer some sort of doc or a generalized 
idea that all seq-consuming, lazy-seq-returning functions build in an 
order-preserving way. If that makes any sense. (I'm thinking of how map and 
filter and so on relate to this idea). 

It's not really a big deal. Just a discussion that came up "in the wild" 
before. 



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Mike Rodriguez
Yeah. It is so hard to come up with a real use case here after I think about it 
that it is best to just let it be. 

It would only matter if identity mattered for something, but still hard to even 
contrive a scenario. So part (2) solved. 



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Mark Engelberg
On Thu, Dec 29, 2016 at 1:32 PM, Sean Corfield  wrote:

>
> > I'm just guessing there the answer may just be "equal values are equal
> and you should never care which one you get out".  There are times to care
> though, but then perhaps just don't use `distinct` or be sure to have a
> test on it.  :P
>
>
> Can you provide a scenario when it matters? Given that you had two
> immutable, equal values in a collection, when would it matter which one was
> discarded and which one was kept?
>
>
They may have different metadata, or some objects may already have cached
hash values while others do not, or they may be complex enough objects that
for a later part in the program it matters that certain equal objects meet
the equality test quickly by actually being identical objects, not just
equal.
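
A small sketch of the identity point, using made-up names `cached` and `rebuilt` for two equal but separate objects. Which instance survives depends on which one comes first, at least with the current implementation:

```clojure
(def cached  (vec (range 1000)))  ; stands in for an instance shared elsewhere
(def rebuilt (vec (range 1000)))  ; equal contents, different object

(= cached rebuilt)                                       ;=> true
(identical? cached rebuilt)                              ;=> false

;; The first occurrence is the object that comes out.
(identical? cached (first (distinct [cached rebuilt])))  ;=> true
(identical? cached (first (distinct [rebuilt cached])))  ;=> false
```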



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Andy Fingerhut
If it helps anyone sleep better at night, were the behavior of distinct
ever to change in a way that breaks one's application, the original one is
right there in the git history, available for everyone's copying and use,
with whatever promises in the doc string you choose to add.
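
As a rough sketch of what such a private copy could look like: `stable-distinct` is a hypothetical name, and the body is written from scratch for illustration rather than copied from clojure.core:

```clojure
;; Hypothetical application-owned variant whose doc string states the promises.
(defn stable-distinct
  "Returns a lazy sequence of the elements of coll with duplicates removed.
  Elements appear in input order, and the first of any equal elements is kept."
  [coll]
  (letfn [(step [xs seen]
            (lazy-seq
              (loop [s (seq xs)]
                (when s
                  (let [x (first s)]
                    (if (contains? seen x)
                      (recur (next s))          ; skip an element already seen
                      (cons x (step (rest s) (conj seen x))))))))) ]
    (step coll #{})))

(stable-distinct [3 1 3 2 1])  ;=> (3 1 2)
```

Owning the function means the stated contract cannot change underneath the application, at the cost of maintaining it yourself.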

Andy

On Thu, Dec 29, 2016 at 1:15 PM, Mike Rodriguez  wrote:

> Yeah, adding a test for undocumented behavior seems somewhat reasonable.
> I do wish the docs would be a bit clearer on these aspects of the
> contract.  Without that it just doesn't seem that there is any real
> commitment to the Clojure implementation to not change later.
>
> I understand the general idea of "it is the most natural way to implement
> it".  However, that is shaky grounds to rely on.  It also makes me question
> if I've really thought it through to know "there isn't any other way to do
> it".  In the case of `distinct` I can be fairly sure it won't reorder them.
>
> Also, it still isn't clear if it keeps the first or later duplicates or
> not.  That was the (2) part of the question.  I'm just guessing there the
> answer may just be "equal values are equal and you should never care which
> one you get out".  There are times to care though, but then perhaps just
> don't use `distinct` or be sure to have a test on it.  :P
>
> I asked this question mostly out of curiosity and to see what others
> thought.  Also, to bring up the issue of if the docs are sufficient.
>
>
> On Wednesday, December 28, 2016 at 3:38:03 PM UTC-6, tbc++ wrote:
>>
>> This is one of those odd questions where the answer of what "could"
>> happen and what "will most likely happen" are completely different. There
>> is no reason why `distinct` should reorder or which item will be preserved.
>> However there's really only one logical way to implement this (the way it's
>> currently implemented) and in that case the answer would be "the first
>> duplicate is used", and "items are not reordered".
>>
>> So all that to say, there's nothing in the docs that specify this is the
>> way it has to be, but it is the way it is now, is most likely not going to
>> change. So if you're really concerned about it, I'd say write a test around
>> distinct and wait for it to break someday if Clojure changes the undefined
>> behavior.
>>
>> On Wed, Dec 28, 2016 at 9:55 AM, Michael Blume 
>> wrote:
>>
>>> Also, I'm assuming distinct uses .equals semantics which might be worth
>>> calling out in the doc
>>>
>>> On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez  wrote:
>>>
>>>> The doc for `distinct` is:
>>>> "Returns a lazy sequence of the elements of coll with duplicates
>>>> removed.
>>>>   Returns a stateful transducer when no collection is provided."
>>>>
>>>> (1) In the lazy sequence case, I've thought that maybe it is assumed
>>>> there is a guarantee that the order of the input seq is preserved.
>>>> However, this isn't stated.  Is this an assumption to rely on for
>>>> `distinct` and, more generally, the Clojure seq-based API functions?
>>>>
>>>> (2) In either case, when there are duplicates, there do not seem to be
>>>> any guarantees on which one of the duplicates will be preserved.  Should
>>>> this be stated?  I'm thinking that maybe this is about Clojure's design
>>>> philosophy being that equal values do not ever need to be distinguished
>>>> between, so the API doesn't explicitly support this concern.  However,
>>>> there are times when identity relationships can matter - performance would
>>>> be one that comes to mind.
>>>> - This has some relationship to the Scala question @
>>>> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>>>>
>>>> There have been a few occasions where I relied on (or wanted to rely
>>>> on) (1).  I haven't had many cases where (2) matters, but I could see it
>>>> coming up on perhaps rare occasions.
>>>>

Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Sean Corfield
> I'm just guessing there the answer may just be "equal values are equal and 
> you should never care which one you get out".  There are times to care 
> though, but then perhaps just don't use `distinct` or be sure to have a test 
> on it.  :P

 

Can you provide a scenario when it matters? Given that you had two immutable, 
equal values in a collection, when would it matter which one was discarded and 
which one was kept?

 

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

 

On 12/29/16, 1:15 PM, "Mike Rodriguez" wrote:

> Yeah, adding a test for undocumented behavior seems somewhat reasonable.  I do 
> wish the docs would be a bit clearer on these aspects of the contract.  Without 
> that it just doesn't seem that there is any real commitment that the Clojure 
> implementation will not change later.
>
> I understand the general idea of "it is the most natural way to implement it".  
> However, that is shaky ground to rely on.  It also makes me question if I've 
> really thought it through to know "there isn't any other way to do it".  In the 
> case of `distinct` I can be fairly sure it won't reorder them.
>
> Also, it still isn't clear whether it keeps the first or a later duplicate.  
> That was the (2) part of the question.  I'm just guessing there the answer may 
> just be "equal values are equal and you should never care which one you get 
> out".  There are times to care though, but then perhaps just don't use 
> `distinct` or be sure to have a test on it.  :P
>
> I asked this question mostly out of curiosity and to see what others thought.  
> Also, to bring up the issue of whether the docs are sufficient.  

 

 



Re: Order preservation and duplicate removal policy in `distinct`

2016-12-29 Thread Mike Rodriguez
Yeah, adding a test for undocumented behavior seems somewhat reasonable.  I 
do wish the docs would be a bit clearer on these aspects of the contract. 
Without that it just doesn't seem that there is any real commitment that the 
Clojure implementation will not change later.

I understand the general idea of "it is the most natural way to implement 
it".  However, that is shaky ground to rely on.  It also makes me question 
if I've really thought it through to know "there isn't any other way to do 
it".  In the case of `distinct` I can be fairly sure it won't reorder them.

Also, it still isn't clear whether it keeps the first or a later duplicate. 
That was the (2) part of the question.  I'm just guessing there the 
answer may just be "equal values are equal and you should never care which 
one you get out".  There are times to care though, but then perhaps just 
don't use `distinct` or be sure to have a test on it.  :P

I asked this question mostly out of curiosity and to see what others 
thought.  Also, to bring up the issue of whether the docs are sufficient.  


On Wednesday, December 28, 2016 at 3:38:03 PM UTC-6, tbc++ wrote:
>
> This is one of those odd questions where the answer of what "could" happen 
> and what "will most likely happen" are completely different. There is no 
> reason why `distinct` should reorder or which item will be preserved. 
> However there's really only one logical way to implement this (the way it's 
> currently implemented) and in that case the answer would be "the first 
> duplicate is used", and "items are not reordered". 
>
> So all that to say, there's nothing in the docs that specify this is the 
> way it has to be, but it is the way it is now, is most likely not going to 
> change. So if you're really concerned about it, I'd say write a test around 
> distinct and wait for it to break someday if Clojure changes the undefined 
> behavior. 
>
> On Wed, Dec 28, 2016 at 9:55 AM, Michael Blume wrote:
>
>> Also, I'm assuming distinct uses .equals semantics which might be worth 
>> calling out in the doc
>>
>> On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez wrote:
>>
>>> The doc for `distinct` is:
>>> "Returns a lazy sequence of the elements of coll with duplicates removed.
>>>   Returns a stateful transducer when no collection is provided."
>>>
>>> (1) In the lazy sequence case, I've thought that maybe it is assumed 
>>> there is a guarantee that the order of the input seq is preserved.  
>>> However, this isn't stated.  Is this an assumption to rely on for 
>>> `distinct` and, more generally, the Clojure seq-based API functions?
>>>
>>> (2) In either case, when there are duplicates, there do not seem to be 
>>> any guarantees on which one of the duplicates will be preserved.  Should 
>>> this be stated?  I'm thinking that maybe this is about Clojure's design 
>>> philosophy being that equal values do not ever need to be distinguished 
>>> between, so the API doesn't explicitly support this concern.  However, 
>>> there are times when identity relationships can matter - performance would 
>>> be one that comes to mind.
>>> - This has some relationship to the Scala question @ 
>>> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>>>
>>> There have been a few occasions where I relied on (or wanted to rely on) 
>>> (1).  I haven't had many cases where (2) matters, but I could see it coming 
>>> up on perhaps rare occasions. 
>>>
>
>
>
> -- 
> “One of the main causes of the fall of th

Re: Order preservation and duplicate removal policy in `distinct`

2016-12-28 Thread Timothy Baldridge
This is one of those odd questions where the answers to what "could" happen
and what "will most likely happen" are completely different. Nothing
documented says whether `distinct` may reorder items or which duplicate
will be preserved. However, there's really only one logical way to
implement this (the way it's currently implemented), and in that case the
answer would be "the first duplicate is used" and "items are not reordered".

So all that to say: nothing in the docs specifies that this is the way it
has to be, but it is the way it is now, and it is most likely not going to
change. So if you're really concerned about it, I'd say write a test around
distinct and wait for it to break someday if Clojure changes the undefined
behavior.
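
A minimal sketch of what such a test could look like, using clojure.test.
The namespace, test name, and sample data are made up for illustration, and
the test only pins down what the current implementation happens to do, not
anything the doc string promises:

```clojure
(ns distinct-behavior-test
  (:require [clojure.test :refer [deftest is]]))

;; Hypothetical regression test; names and data are illustrative only.
(deftest distinct-keeps-order-and-first-duplicate
  ;; Input order is retained.
  (is (= [3 1 2] (distinct [3 1 3 2 1])))
  ;; Two equal values that differ only in metadata: the first one seen
  ;; is the one that survives.
  (let [a (with-meta [1] {:pos :first})
        b (with-meta [1] {:pos :second})]
    (is (= [{:pos :first}] (map meta (distinct [a b]))))))
```

Metadata is used here because it is not considered in equality, so it is an
easy way to tell two otherwise equal values apart.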

On Wed, Dec 28, 2016 at 9:55 AM, Michael Blume  wrote:

> Also, I'm assuming distinct uses .equals semantics which might be worth
> calling out in the doc
>
> On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez  wrote:
>
>> The doc for `distinct` is:
>> "Returns a lazy sequence of the elements of coll with duplicates removed.
>>   Returns a stateful transducer when no collection is provided."
>>
>> (1) In the lazy sequence case, I've thought that maybe it is assumed
>> there is a guarantee that the order of the input seq is preserved.
>> However, this isn't stated.  Is this an assumption to rely on for
>> `distinct` and, more generally, the Clojure seq-based API functions?
>>
>> (2) In either case, when there are duplicates, there do not seem to be
>> any guarantees on which one of the duplicates will be preserved.  Should
>> this be stated?  I'm thinking that maybe this is about Clojure's design
>> philosophy being that equal values do not ever need to be distinguished
>> between, so the API doesn't explicitly support this concern.  However,
>> there are times when identity relationships can matter - performance would
>> be one that comes to mind.
>> - This has some relationship to the Scala question @
>> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>>
>> There have been a few occasions where I relied on (or wanted to rely on)
>> (1).  I haven't had many cases where (2) matters, but I could see it coming
>> up on perhaps rare occasions.
>>



-- 
“One of the main causes of the fall of the Roman Empire was that–lacking
zero–they had no way to indicate successful termination of their C
programs.”
(Robert Firth)

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Order preservation and duplicate removal policy in `distinct`

2016-12-28 Thread Michael Blume
Also, I'm assuming distinct uses .equals semantics which might be worth
calling out in the doc
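
For reference, Alex Miller's reply elsewhere in this thread says the check
is based on set `contains?`, which uses Clojure's "equiv" equality rather
than Java's .equals. A small REPL sketch of where the two differ (the var
name `deduped` is only for illustration):

```clojure
;; Clojure's = treats 1 and 1N as the same value; Java's .equals does not.
(= 1 1N)        ; => true
(.equals 1 1N)  ; => false

;; distinct follows the = behavior here, so 1N counts as a duplicate of 1.
(def deduped (distinct [1 1N 1]))
deduped         ; => (1)
```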

On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez  wrote:

> The doc for `distinct` is:
> "Returns a lazy sequence of the elements of coll with duplicates removed.
>   Returns a stateful transducer when no collection is provided."
>
> (1) In the lazy sequence case, I've thought that maybe it is assumed there
> is a guarantee that the order of the input seq is preserved.  However, this
> isn't stated.  Is this an assumption to rely on for `distinct` and, more
> generally, the Clojure seq-based API functions?
>
> (2) In either case, when there are duplicates, there do not seem to be any
> guarantees on which one of the duplicates will be preserved.  Should this
> be stated?  I'm thinking that maybe this is about Clojure's design
> philosophy being that equal values do not ever need to be distinguished
> between, so the API doesn't explicitly support this concern.  However,
> there are times when identity relationships can matter - performance would
> be one that comes to mind.
> - This has some relationship to the Scala question @
> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>
> There have been a few occasions where I relied on (or wanted to rely on)
> (1).  I haven't had many cases where (2) matters, but I could see it coming
> up on perhaps rare occasions.
>
