Re: Order preservation and duplicate removal policy in `distinct`
Thanks for the feedback, Alex. As for:

> If you wanted to file a jira on anything here, a jira to add a line to the doc string stating that the first duplicate is kept would be the only thing possibly worth doing.

I'll get one logged then.

On Saturday, December 31, 2016 at 12:05:16 PM UTC-5, Alex Miller wrote:
> Replying to many things in this thread at once here...
>
> Re lazy sequences, I think you can take it as implicit and rely on the input seq order being retained (same as other sequence functions).
>
> Re duplicates, the current implementation retains the first element, but as mentioned this is not stated in the doc string (so probably should not be something you rely upon).
>
> "Equality" in this case is based upon set `contains?` checks, which ultimately use Clojure's "equiv" notion of equality (NOT Java's .equals). In general for Clojure, equality is based on values and two duplicate values will be indistinguishable. However, two cases where that might not be the case are when they have meta (which is not considered in equality) or if they are arbitrary Java objects (which fall back to .equals behavior).
>
> If you wanted to file a jira on anything here, a jira to add a line to the doc string stating that the first duplicate is kept would be the only thing possibly worth doing.
>
> Alex

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: Order preservation and duplicate removal policy in `distinct`
Replying to many things in this thread at once here...

Re lazy sequences, I think you can take it as implicit and rely on the input seq order being retained (same as other sequence functions).

Re duplicates, the current implementation retains the first element, but as mentioned this is not stated in the doc string (so probably should not be something you rely upon).

"Equality" in this case is based upon set `contains?` checks, which ultimately use Clojure's "equiv" notion of equality (NOT Java's .equals). In general for Clojure, equality is based on values and two duplicate values will be indistinguishable. However, two cases where that might not be the case are when they have meta (which is not considered in equality) or if they are arbitrary Java objects (which fall back to .equals behavior).

If you wanted to file a jira on anything here, a jira to add a line to the doc string stating that the first duplicate is kept would be the only thing possibly worth doing.

Alex

On Wednesday, December 28, 2016 at 10:22:53 AM UTC-6, Mike Rodriguez wrote:
> The doc for `distinct` is:
> "Returns a lazy sequence of the elements of coll with duplicates removed.
> Returns a stateful transducer when no collection is provided."
>
> (1) In the lazy sequence case, I've thought that maybe it is assumed there is a guarantee that the order of the input seq is preserved. However, this isn't stated. Is this an assumption to rely on for `distinct` and, more generally, the Clojure seq-based API functions?
>
> (2) In either case, when there are duplicates, there do not seem to be any guarantees on which one of the duplicates will be preserved. Should this be stated? I'm thinking that maybe this is about Clojure's design philosophy being that equal values do not ever need to be distinguished between, so the API doesn't explicitly support this concern. However, there are times when identity relationships can matter - performance would be one that comes to mind.
> - This has some relationship to the Scala question @ http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>
> There have been a few occasions where I relied on (or wanted to rely on) (1). I haven't had many cases where (2) matters, but I could see it coming up on perhaps rare occasions.
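A small REPL-style sketch of the points in this message may help. It shows what the current implementation does, as described above; none of it is promised by the doc string, and the var names (`ordered`, `mixed`, `objects`) are only illustrative.

```clojure
;; Sketch of current (undocumented) behavior, not a documented contract.

;; Input order is retained; later duplicates are dropped.
(def ordered (distinct [3 1 3 2 1]))
;; ordered => (3 1 2)

;; Comparison uses Clojure's equiv rather than Java's .equals:
;; 1 and 1N are = but not .equals, and distinct treats them as duplicates.
(def mixed (distinct [1 1N]))
;; mixed => (1)

;; Arbitrary Java objects fall back to .equals; two plain Objects are
;; never .equals to each other, so both are kept.
(def objects (distinct [(Object.) (Object.)]))
;; (count objects) => 2
```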
Re: Order preservation and duplicate removal policy in `distinct`
Yeah, I was thinking about logging the ticket for it. I just figured I'd discuss it on the Google Groups first to see if anyone else thought it was a useful concern. It seems that some people have opinions on it in both directions, i.e. docs are sufficient vs. docs are not sufficient.

On Thursday, December 29, 2016 at 5:50:14 PM UTC-6, Matching Socks wrote:
> How about a ticket for enhancement of the API documentation to clarify the nature of distinct's parameter (any seqable, even lazy)? That would distinguish it from, e.g., (dedupe (sort coll)).
Re: Order preservation and duplicate removal policy in `distinct`
On Thursday, December 29, 2016 at 5:47:14 PM UTC-6, Erik Assum wrote:
> Wouldn't the order be different depending on whether you keep the first or the last?
>
> (distinct [1 2 1])
> => [1 2]
> vs
> (distinct [1 2 1])
> => [2 1]
>
> Erik.
> --
> on the go

I should have thought about this scenario beforehand. I was almost convinced here that there was no usefulness in knowing which duplicate was kept. However, it can affect order, so really the part (2) problem relates to part (1). Thanks for pointing that out.

And in reference to this one:

On Thursday, December 29, 2016 at 4:16:20 PM UTC-6, puzzler wrote:
> They may have different metadata, or some objects may already have cached hash values while others do not, or they may be complex enough objects that for a later part in the program it matters that certain equal objects meet the equality test quickly by actually being identical objects, not just equal.

These were the sort of things I was thinking about originally as the issue with my point (2). I do agree that object identity has properties that can matter. The metadata is a particularly good point. However, as I said above, the ordering concern is the issue I was mostly concerned with in general.

I appreciate everyone's input on this.
Re: Order preservation and duplicate removal policy in `distinct`
I was trying to understand part 2 of the OP’s question (per the piece I had quoted in my response).

Part 1 of the OP’s question is pretty clear cut: the order of the result depends on which element is kept. Given that you could use `set` instead of `distinct` if you *don’t* care about the order, it could reasonably be argued that `distinct` should indeed provide an ordering guarantee.

Note that part 2 of the OP’s question would still hold for calling `set`.

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

On 12/29/16, 3:46 PM, "Erik Assum" wrote:
> Wouldn't the order be different depending on whether you keep the first or the last?
>
> (distinct [1 2 1])
> => [1 2]
> vs
> (distinct [1 2 1])
> => [2 1]
>
> Erik.
> --
> on the go
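As a rough illustration of the `set` vs. `distinct` trade-off discussed in this message, the sketch below compares the two on the same input. The ordering shown for `distinct` is the current, undocumented behavior; the var names are only illustrative.

```clojure
(def xs [3 1 3 2 1])

;; distinct returns a seq; with the current implementation it follows
;; the order in which values first appear in the input.
(def as-seq (distinct xs))
;; as-seq => (3 1 2)

;; set returns an unordered collection; its iteration order is not
;; something to rely on, but it holds the same distinct values.
(def as-set (set xs))
;; as-set contains exactly 1, 2 and 3
```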
Re: Order preservation and duplicate removal policy in `distinct`
How about a ticket for enhancement of the API documentation to clarify the nature of distinct's parameter (any seqable, even lazy)? That would distinguish it from, e.g., (dedupe (sort coll)).
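The difference between `distinct` and `(dedupe (sort coll))` mentioned here can be sketched as follows. This is only an illustration with made-up var names; the `distinct` results reflect current behavior rather than a documented guarantee.

```clojure
(def coll [3 1 3 3 2 1])

;; dedupe only drops *consecutive* duplicates.
(def deduped (dedupe coll))
;; deduped => (3 1 3 2 1)

;; Sorting first makes duplicates adjacent, but it needs a finite,
;; comparable input and loses the original order.
(def sorted-deduped (dedupe (sort coll)))
;; sorted-deduped => (1 2 3)

;; distinct needs neither sorting nor a finite input: taking the first
;; three distinct values of an infinite seq terminates.
(def from-infinite (take 3 (distinct (cycle [1 2 1 3]))))
;; from-infinite => (1 2 3)
```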
Re: Order preservation and duplicate removal policy in `distinct`
Wouldn't the order be different depending on whether you keep the first or the last?

(distinct [1 2 1])
=> [1 2]
vs
(distinct [1 2 1])
=> [2 1]

Erik.
--
on the go

> On 29 Dec 2016, at 22:32, Sean Corfield wrote:
>
> Can you provide a scenario when it matters? Given that you had two immutable, equal values in a collection, when would it matter which one was discarded and which one was kept?
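The two outcomes in this message can be reproduced with a small sketch. `distinct-keep-last` is a hypothetical helper written only for this illustration (it is not part of clojure.core), and the keep-first result reflects what the current implementation happens to do.

```clojure
;; Hypothetical helper, not part of clojure.core: keeps the LAST occurrence
;; of each value by running distinct over the reversed input.
;; It is eager (reverse realizes the whole input), so finite inputs only.
(defn distinct-keep-last [coll]
  (reverse (distinct (reverse coll))))

(def keep-first (distinct [1 2 1]))
;; keep-first => (1 2)

(def keep-last (distinct-keep-last [1 2 1]))
;; keep-last => (2 1)
```

Both results contain the same values, so the choice of which duplicate survives is visible only through the resulting order (or through identity and metadata).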
Re: Order preservation and duplicate removal policy in `distinct`
> If it helps anyone sleep better at night, were the behavior of distinct ever to change in a way that breaks one's application, the original one is right there in the git history, available for everyone's copying and use, with whatever promises in the doc string you choose to add.

I understand it is easy to just fix it once it breaks. The point I had is just that without a guarantee it's an assumption. And if you don't think about it and it later changes, it can be subtle and hard to find.

This actually came up as a discussion with my coworkers one day. Someone mentioned Clojure doesn't seem to have any clear contract around whether distinct preserves order or not. And I had a hard time arguing that you can "just trust it" since it doesn't really state that it does.

So the main suggestion I'm getting is to just write my own test for this sort of assumption to avoid issues. I'd prefer some sort of doc or a generalized idea that all seq-consuming, lazy-seq-returning functions build their results in an order-preserving way, if that makes any sense. (I'm thinking of how map and filter and so on relate to this idea.)

It's not really a big deal. Just a discussion I had come up "in the wild" before.
Re: Order preservation and duplicate removal policy in `distinct`
Yeah. It is so hard to come up with a real use case here after I think about it that it is best to just let it be. It would only matter if identity mattered for something, but it is still hard to even contrive a scenario. So part (2) solved.
Re: Order preservation and duplicate removal policy in `distinct`
On Thu, Dec 29, 2016 at 1:32 PM, Sean Corfield wrote:
> > I'm just guessing there the answer may just be "equal values are equal and you should never care which one you get out". There are times to care though, but then perhaps just don't use `distinct` or be sure to have a test on it. :P
>
> Can you provide a scenario when it matters? Given that you had two immutable, equal values in a collection, when would it matter which one was discarded and which one was kept?

They may have different metadata, or some objects may already have cached hash values while others do not, or they may be complex enough objects that for a later part in the program it matters that certain equal objects meet the equality test quickly by actually being identical objects, not just equal.
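The metadata and identity cases can be made concrete with a short sketch. The names `a`, `b` and `kept` and the `:source` key are made up for the example, and the result shown for `distinct` is what the current implementation does, not something the doc string states.

```clojure
;; Two equal vectors that differ only in metadata.
(def a (with-meta [1 2] {:source :first}))
(def b (with-meta [1 2] {:source :second}))

(def kept (first (distinct [a b])))

(= a b)              ; => true, metadata is ignored by =
(identical? a b)     ; => false, they are separate objects
(identical? kept a)  ; => true with the current implementation
(meta kept)          ; => {:source :first}
```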
Re: Order preservation and duplicate removal policy in `distinct`
If it helps anyone sleep better at night, were the behavior of distinct ever to change in a way that breaks one's application, the original one is right there in the git history, available for everyone's copying and use, with whatever promises in the doc string you choose to add.

Andy

On Thu, Dec 29, 2016 at 1:15 PM, Mike Rodriguez wrote:
> Yeah, adding a test for undocumented behavior seems somewhat reasonable. I do wish the docs would be a bit clearer on these aspects of the contract. Without that, there doesn't seem to be any real commitment that the Clojure implementation won't change later.
>
> I understand the general idea of "it is the most natural way to implement it". However, that is shaky ground to rely on. It also makes me question if I've really thought it through to know "there isn't any other way to do it". In the case of `distinct` I can be fairly sure it won't reorder them.
>
> Also, it still isn't clear whether it keeps the first or a later duplicate. That was the (2) part of the question. I'm just guessing there the answer may just be "equal values are equal and you should never care which one you get out". There are times to care though, but then perhaps just don't use `distinct` or be sure to have a test on it. :P
>
> I asked this question mostly out of curiosity and to see what others thought. Also, to bring up the issue of whether the docs are sufficient.
Re: Order preservation and duplicate removal policy in `distinct`
> I'm just guessing there the answer may just be "equal values are equal and you should never care which one you get out". There are times to care though, but then perhaps just don't use `distinct` or be sure to have a test on it. :P

Can you provide a scenario when it matters? Given that you had two immutable, equal values in a collection, when would it matter which one was discarded and which one was kept?

Sean Corfield -- (970) FOR-SEAN -- (904) 302-SEAN
An Architect's View -- http://corfield.org/

"If you're not annoying somebody, you're not really alive."
-- Margaret Atwood

On 12/29/16, 1:15 PM, "Mike Rodriguez" wrote:
> Yeah, adding a test for undocumented behavior seems somewhat reasonable. I do wish the docs would be a bit clearer on these aspects of the contract. Without that, there doesn't seem to be any real commitment that the Clojure implementation won't change later.
>
> I understand the general idea of "it is the most natural way to implement it". However, that is shaky ground to rely on. It also makes me question if I've really thought it through to know "there isn't any other way to do it". In the case of `distinct` I can be fairly sure it won't reorder them.
>
> Also, it still isn't clear whether it keeps the first or a later duplicate. That was the (2) part of the question. I'm just guessing there the answer may just be "equal values are equal and you should never care which one you get out". There are times to care though, but then perhaps just don't use `distinct` or be sure to have a test on it. :P
>
> I asked this question mostly out of curiosity and to see what others thought. Also, to bring up the issue of whether the docs are sufficient.
Re: Order preservation and duplicate removal policy in `distinct`
Yeah, adding a test for undocumented behavior seems somewhat reasonable. I do wish the docs would be a bit clearer on these aspects of the contract. Without that, there doesn't seem to be any real commitment that the Clojure implementation won't change later.

I understand the general idea of "it is the most natural way to implement it". However, that is shaky ground to rely on. It also makes me question if I've really thought it through to know "there isn't any other way to do it". In the case of `distinct` I can be fairly sure it won't reorder them.

Also, it still isn't clear whether it keeps the first or a later duplicate. That was the (2) part of the question. I'm just guessing there the answer may just be "equal values are equal and you should never care which one you get out". There are times to care though, but then perhaps just don't use `distinct` or be sure to have a test on it. :P

I asked this question mostly out of curiosity and to see what others thought. Also, to bring up the issue of whether the docs are sufficient.

On Wednesday, December 28, 2016 at 3:38:03 PM UTC-6, tbc++ wrote:
> This is one of those odd questions where the answers to what "could" happen and what "will most likely happen" are completely different. There is nothing that says whether `distinct` may reorder or which item will be preserved. However there's really only one logical way to implement this (the way it's currently implemented) and in that case the answer would be "the first duplicate is used", and "items are not reordered".
>
> So all that to say, there's nothing in the docs that specifies this is the way it has to be, but it is the way it is now, and it is most likely not going to change. So if you're really concerned about it, I'd say write a test around distinct and wait for it to break someday if Clojure changes the undefined behavior.
Re: Order preservation and duplicate removal policy in `distinct`
This is one of those odd questions where the answer of what "could" happen and what "will most likely happen" are completely different. There is no reason why `distinct` should reorder or which item will be preserved. However there's really only one logical way to implement this (the way it's currently implemented) and in that case the answer would be "the first duplicate is used", and "items are not reordered". So all that to say, there's nothing in the docs that specify this is the way it has to be, but it is the way it is now, is most likely not going to change. So if you're really concerned about it, I'd say write a test around distinct and wait for it to break someday if Clojure changes the undefined behavior. On Wed, Dec 28, 2016 at 9:55 AM, Michael Blume wrote: > Also, I'm assuming distinct uses .equals semantics which might be worth > calling out in the doc > > On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez wrote: > >> The doc for `distinct` is: >> "Returns a lazy sequence of the elements of coll with duplicates removed. >> Returns a stateful transducer when no collection is provided." >> >> (1) In the lazy sequence case, I've thought that maybe it is assuemd >> there is a guarantee that the order of the input seq is preserved. >> However, this isn't stated. Is this an assumption to rely on for >> `distinct` and, more generally, the Clojure seq-based API functions? >> >> (2) In either case, when there are duplicates, there do not seem to be >> any guarantees on which one of the duplicates will be preserved. Should >> this be stated? I'm thinking that maybe this is about Clojure's design >> philosophy being that equal values to not ever need to be distinguished >> between, so the API doesn't explicitly support this concern. However, >> there are times when identity relationships can matter - performance would >> be one that comes to mind. 
>> - This has some relationship to the Scala question @
>> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>>
>> There have been a few occasions where I relied on (or wanted to rely on)
>> (1). I haven't had many cases where (2) matters, but I could see it coming
>> up on perhaps rare occasions.
--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)
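The "write a test around distinct" suggestion in the message above could look roughly like the sketch below. It is only an illustration: the namespace and test names are made up, and the assertions pin down what the current implementation does, not anything the doc string promises. Metadata is used to tell two equal vectors apart, since it is not part of equality.

```clojure
(ns distinct-behavior-test
  ;; Hypothetical namespace name, chosen for this sketch only.
  (:require [clojure.test :refer [deftest is run-tests]]))

(deftest distinct-preserves-input-order
  ;; Elements should come out in the order of their first appearance.
  (is (= [3 1 2] (distinct [3 1 3 2 1 2]))))

(deftest distinct-keeps-first-duplicate
  ;; a and b are equal as values but carry different metadata,
  ;; so we can check which of the two survives.
  (let [a      (with-meta [1 2] {:origin :first})
        b      (with-meta [1 2] {:origin :second})
        result (distinct [a b])]
    (is (= 1 (count result)))
    (is (identical? a (first result)))
    (is (= {:origin :first} (meta (first result))))))

;; Run both tests in the current namespace.
(run-tests)
```

If a future Clojure release changed either behavior, these tests would start failing, which is the early warning the message recommends.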
Re: Order preservation and duplicate removal policy in `distinct`
Also, I'm assuming `distinct` uses .equals semantics, which might be worth calling out in the doc.

On Wed, Dec 28, 2016, 11:22 AM Mike Rodriguez wrote:
> The doc for `distinct` is:
> "Returns a lazy sequence of the elements of coll with duplicates removed.
> Returns a stateful transducer when no collection is provided."
>
> (1) In the lazy sequence case, I've thought that maybe it is assumed there
> is a guarantee that the order of the input seq is preserved. However, this
> isn't stated. Is this an assumption to rely on for `distinct` and, more
> generally, the Clojure seq-based API functions?
>
> (2) In either case, when there are duplicates, there do not seem to be any
> guarantees on which one of the duplicates will be preserved. Should this
> be stated? I'm thinking that maybe this is about Clojure's design
> philosophy being that equal values do not ever need to be distinguished
> between, so the API doesn't explicitly support this concern. However,
> there are times when identity relationships can matter - performance would
> be one that comes to mind.
> - This has some relationship to the Scala question @
> http://stackoverflow.com/questions/6735568/scala-seqlike-distinct-preserves-order
>
> There have been a few occasions where I relied on (or wanted to rely on)
> (1). I haven't had many cases where (2) matters, but I could see it coming
> up on perhaps rare occasions.
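On the `.equals` assumption: as clarified elsewhere in this thread, the duplicate check goes through a set `contains?` lookup, which uses Clojure's `=` (equiv) rather than Java's `.equals`. A small sketch of where the difference shows up follows; the var names are invented for illustration, and the results reflect current behavior rather than a documented contract.

```clojure
;; 1 (a Long) and 1N (a BigInt) are equal under =, but not under .equals.
(println (= 1 1N))       ; prints true
(println (.equals 1 1N)) ; prints false

;; distinct follows =, so the Long and the BigInt collapse into one element.
(def long-and-bigint (distinct [1 1N]))

;; = does not equate an integer with a double, so both survive.
(def long-and-double (distinct [1 1.0]))

;; Metadata is ignored by equality: the two vectors count as duplicates.
(def with-different-meta
  (distinct [(with-meta [:a] {:origin :first})
             (with-meta [:a] {:origin :second})]))

;; Arbitrary Java objects fall back to .equals (identity for plain Objects).
(def plain-objects (distinct [(Object.) (Object.)]))

(println (count long-and-bigint))     ; prints 1
(println (count long-and-double))     ; prints 2
(println (count with-different-meta)) ; prints 1
(println (count plain-objects))       ; prints 2
```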