Also, after some research, I found that HashMultisets also have some area of application in image detection and event simulations
вс, 21 апр. 2024 г. в 01:19, ІП-24 Олександр Ротань < rotan.olexa...@gmail.com>: > Sorry for duplicate, I accidentally sent last message only to you > > вс, 21 апр. 2024 г. в 01:19, ІП-24 Олександр Ротань < > rotan.olexa...@gmail.com>: > >> Firstly, about queues. As far as I know, priority queue is implemented as >> heap, not tree, so things like subQueue doesn't make much sense in this >> context. >> >> About popularity: this is indeed not very popular, might even be >> negligible outside of sorted multiset (lets adopt java naming conventions >> for those from now on). Although, I could provide at least few examples: >> >> 1. Batch tasks processing, especially in concurrent environments, when >> tasks could have different priorities. Sometimes tasks that are stored in a >> queue are processed in batches. Using today's stdlib of Java, >> acquiring such a batch of tasks from collection would be a O(n * log n) >> complexity, because each heapification is log n and called after each task >> is popped. Using TreeMultiset, this could be achieved in log (n). This also >> applies to messaging. >> >> 2. Implementing custom data structures. Not a task most of commercial >> developers encounter, although for specific types of tasks, like big data >> processing, custom datastructs could be a thing. Some collections, >> expeciall ordered, like B-tree, could require sorted multiset as part of >> internal implementation. >> >> 3. Again, for sorted multisets, counting frequency is a common task for >> commercial developers. Counting element frequencies using multisets is the >> thing that simply couldn't be achieved by set and is inefficient in list. >> This could have various use cases, from data displaying in histograms in >> generated reports to data validation is stateful systems (checking if user >> has not exceeded permitted amount of attempts, for example). >> >> 4. Storage for the result of db queries. Another common task is to fetch >> some ordered data from the database, and then modify it to preserve order. >> This category of tasks could be optimized using ordered multisets. >> >> 5. Concurent bag, as you just mentioned. This could be used in concurrent >> environment as I mentioned above, or just as less restricting >> concurrent collection as preserving sequence of elements could be a complex >> and resource consuming task in mutable concurrent collections >> >> As you might see, almost all of those indeed converge to a sorted >> collection of duplicating elements. I am not sure If there is much more to >> it, but what i would like to say is that feature like this would definitely >> not be redundant for "enterprise" language like Java, where optimizing some >> operations from O(n) or even O(n * log n) to O (log n) could introduce >> significant performance improvement, as well as reducing amount of possible >> bugs by taking on some of responsibilities related to preserving order of >> elements sorted. >> >> вс, 21 апр. 2024 г. в 00:33, Holo The Sage Wolf <holo3...@gmail.com>: >> >> вс, 21 апр. 2024 г. в 01:06, ІП-24 Олександр Ротань < >> rotan.olexa...@gmail.com>: >> >>> Firstly, about queues. As far as I know, priority queue is implemented >>> as heap, not tree, so things like subQueue doesn't make much sense in this >>> context. >>> >>> About popularity: this is indeed not very popular, might even be >>> negligible outside of sorted multiset (lets adopt java naming conventions >>> for those from now on). Although, I could provide at least few examples: >>> >>> 1. Batch tasks processing, especially in concurrent environments, when >>> tasks could have different priorities. Sometimes tasks that are stored in a >>> queue are processed in batches. Using today's stdlib of Java, >>> acquiring such a batch of tasks from collection would be a O(n * log n) >>> complexity, because each heapification is log n and called after each task >>> is popped. Using TreeMultiset, this could be achieved in log (n). This also >>> applies to messaging. >>> >>> 2. Implementing custom data structures. Not a task most of commercial >>> developers encounter, although for specific types of tasks, like big data >>> processing, custom datastructs could be a thing. Some collections, >>> expeciall ordered, like B-tree, could require sorted multiset as part of >>> internal implementation. >>> >>> 3. Again, for sorted multisets, counting frequency is a common task for >>> commercial developers. Counting element frequencies using multisets is the >>> thing that simply couldn't be achieved by set and is inefficient in list. >>> This could have various use cases, from data displaying in histograms in >>> generated reports to data validation is stateful systems (checking if user >>> has not exceeded permitted amount of attempts, for example). >>> >>> 4. Storage for the result of db queries. Another common task is to fetch >>> some ordered data from the database, and then modify it to preserve order. >>> This category of tasks could be optimized using ordered multisets. >>> >>> 5. Concurent bag, as you just mentioned. This could be used in >>> concurrent environment as I mentioned above, or just as less restricting >>> concurrent collection as preserving sequence of elements could be a complex >>> and resource consuming task in mutable concurrent collections >>> >>> As you might see, almost all of those indeed converge to a sorted >>> collection of duplicating elements. I am not sure If there is much more to >>> it, but what i would like to say is that feature like this would definitely >>> not be redundant for "enterprise" language like Java, where optimizing some >>> operations from O(n) or even O(n * log n) to O (log n) could introduce >>> significant performance improvement, as well as reducing amount of possible >>> bugs by taking on some of responsibilities related to preserving order of >>> elements sorted. >>> >>> вс, 21 апр. 2024 г. в 00:33, Holo The Sage Wolf <holo3...@gmail.com>: >>> >>>> Let me prefix this by saying that I'm a mathematician, so maybe the >>>> lingo I use is a bit different. A multi set has no structure apart from >>>> "counting duplication", just like a set has no structure at all, ordered >>>> set **is not a set** and ordered multi set **is not a multi set**. >>>> >>>> With the clarification you gave I understand what you mean and retract >>>> my first question (although I believe that "sortedX" or "priorityX" are the >>>> naming Java has for what you called "orderedX"). >>>> >>>> > THe whole point of the bag is to store duplicate elements without >>>> preserving order, which allows things like ordered collections. >>>> >>>> No, the whole point of the bag is to store elements. The difference >>>> between a bag and a list is that a bag has less, anything you can do with a >>>> bag you can do with a list. >>>> >>>> The benefit of having a bag is to have a more restrictive interface for >>>> the times you don't need the extra structure, as well as the possible >>>> performant implementation that doesn't need to respect a stricted contract. >>>> >>>> CopyOnWriteArratList is more than just a bag. >>>> >>>> >>>> > then that just adds one more use case to potential Bag interface >>>> >>>> Indeed, I gave concurrent bag as an example of a possible >>>> implementation of a bag interface, but you didn't answer my main question, >>>> just how common is a use if a bag to change the std for it? >>>> >>>> To me it sounds like you are interested only on the sorted/navigable >>>> variations, so an alternative solution will be to improve PriorityQueue to >>>> have a "subQueue" methods (and friends) as well as extending it to a >>>> Navigable variation. >>>> >>>> >>>> >>>> >>>> On Sun, 21 Apr 2024, 00:01 ІП-24 Олександр Ротань, < >>>> rotan.olexa...@gmail.com> wrote: >>>> >>>>> Concurrent Bag is something like CopyOnWriteArrayList or Vector, don't >>>>> we have it already? (I know vectors aren't optimized). THe whole point of >>>>> the bag is to store duplicate elements without preserving order, which >>>>> allows things like ordered collections. There could be mutable collections >>>>> for concurrency optimized with read-write locks though, if that is what >>>>> you >>>>> are referring to as "concurrent bag", and I am missing the point why this >>>>> can't be a part of list interface, then that just adds one more use case >>>>> to >>>>> potential Bag interface. >>>>> >>>>> PS: Ordered means it preserves order based on comparator, what you are >>>>> referring to as ordered essentially means sequential, or am I missing a >>>>> point? >>>>> >>>>> сб, 20 апр. 2024 г. в 23:52, Holo The Sage Wolf <holo3...@gmail.com>: >>>>> >>>>>> By "ordered collection" you mean "unordered collection"? >>>>>> >>>>>> How common is it to actually use a multiset in general? Of course >>>>>> there are use cases, and there are areas in which it is used a lot, but I >>>>>> don't believe it is common enough to be part of the std with the one >>>>>> exception of concurrent bag. >>>>>> >>>>>> On Sat, 20 Apr 2024, 23:25 ІП-24 Олександр Ротань, < >>>>>> rotan.olexa...@gmail.com> wrote: >>>>>> >>>>>>> In this letter I would like to express some of my thoughts regarding >>>>>>> the potential Multiset interface. >>>>>>> >>>>>>> I, personally, have encountered a few situations where such an >>>>>>> interface could come in handy, mostly when I needed an ordered >>>>>>> collection >>>>>>> that permits duplicates. That time I used guava`s TreeMultiset, but I >>>>>>> think >>>>>>> Java itself should have such a collection in its std library. While it >>>>>>> is >>>>>>> not a very common problem, there are still a bunch of use cases where >>>>>>> such >>>>>>> things could come in handy. >>>>>>> >>>>>>> I am willing to take on development of such thing, but there are a >>>>>>> few concerns about Multiset: >>>>>>> >>>>>>> 1. Is there any other use for this besides ordered collection that >>>>>>> permits duplicates? I can't remember anything else from the top of my >>>>>>> head. >>>>>>> >>>>>>> 2. Guava's TreeMultiset class hierarchy pretty much imitates TreeSet >>>>>>> class hierarchy, while not being directly connected. I think introducing >>>>>>> any other ordered collection will require some refactoring of existing >>>>>>> collection interfaces (for example extract SortedCollection from >>>>>>> SortedSet, >>>>>>> Navigable Collection from NavigableSet etc.). From the perspective of >>>>>>> clean >>>>>>> code, this would be the right decision, but I feel like this will be a >>>>>>> very >>>>>>> complex task to accomplish. >>>>>>> >>>>>>> 3. Maybe there should be few versions of Tree collection (for >>>>>>> example, for regular tasks red-black tree and B-Tree for large amounts >>>>>>> of >>>>>>> data). I have some expirience implementing both, but is it really >>>>>>> needed in >>>>>>> standard library? >>>>>>> >>>>>>