Evolution of Collection Types in Swifts
Understand the protocols that make Arrays, Dictionaries, and Sets work
The three primary collection types in Swift — Array
, Dictionary
and Set
— need no introduction. But if you’ve ever explored protocol conformance hierarchy for any of them, you may have noticed that it may be quite long and seemingly overcomplicated. For example, for Array
, it may look like following (due to multiple protocol conformances, there can be multiple paths, and this is not the longest one):
Sequence <- Collection <- MutableCollection <- Array
Most of the methods and properties available for Array
are spread out over those protocols. But what is the point of having so many stages of ripening of Array
’s functionality? Let’s start from the beginning and figure out which new semantics and new API are brought in with each of those stages.
https://www.fareva.com/knv/video-MU-v-S-U-tvs-0009-x1.html
https://www.fareva.com/knv/video-MU-v-S-U-tvs-0009-x2.html
https://www.fareva.com/knv/video-MU-v-S-U-tvs-0009-x3.html
https://www.fareva.com/knv/video-MU-v-S-U-tvs-0009-x4.html
https://www.fareva.com/knv/video-MU-v-S-U-tvs-0009-x5.html
https://www.fareva.com/knv/video-ro-v-tori-tv-gratis0011.html
https://www.fareva.com/knv/video-ro-v-tori-tv-gratis0012.html
https://www.fareva.com/knv/video-ro-v-tori-tv-gratis0013.html
https://www.fareva.com/knv/video-ro-v-tori-tv-gratis0014.html
https://www.fareva.com/knv/video-ro-v-tori-tv-gratis0015.html
https://www.fareva.com/knv/Videos-Willem-v-Vitesse-nl01.html
https://www.fareva.com/knv/Videos-Willem-v-Vitesse-nl02.html
https://www.fareva.com/knv/Videos-Willem-v-Vitesse-nl03.html
https://www.fareva.com/knv/Videos-Willem-v-Vitesse-nl04.html
https://www.fareva.com/knv/Videos-Willem-v-Vitesse-nl05.html
https://phoenixvilleseniorcenter.org/cvt/video-MU-v-S-U-tvs-0009-x1.html
https://phoenixvilleseniorcenter.org/cvt/video-MU-v-S-U-tvs-0009-x2.html
https://phoenixvilleseniorcenter.org/cvt/video-MU-v-S-U-tvs-0009-x3.html
https://phoenixvilleseniorcenter.org/cvt/video-MU-v-S-U-tvs-0009-x4.html
https://phoenixvilleseniorcenter.org/cvt/video-MU-v-S-U-tvs-0009-x5.html
https://phoenixvilleseniorcenter.org/cvt/Videos-Willem-v-Vitesse-nl01.html
https://phoenixvilleseniorcenter.org/cvt/Videos-Willem-v-Vitesse-nl02.html
https://phoenixvilleseniorcenter.org/cvt/Videos-Willem-v-Vitesse-nl03.html
https://phoenixvilleseniorcenter.org/cvt/Videos-Willem-v-Vitesse-nl04.html
https://phoenixvilleseniorcenter.org/cvt/Videos-Willem-v-Vitesse-nl05.html
https://www.visualcom.it/kux/video-ro-v-tori-tv-gratis0011.html
https://www.visualcom.it/kux/video-ro-v-tori-tv-gratis0012.html
https://www.visualcom.it/kux/video-ro-v-tori-tv-gratis0013.html
https://www.visualcom.it/kux/video-ro-v-tori-tv-gratis0014.html
https://www.visualcom.it/kux/video-ro-v-tori-tv-gratis0015.html
http://trob.be/ktx/video-m-v-c-be801.html
http://trob.be/ktx/video-m-v-c-be802.html
http://trob.be/ktx/video-m-v-c-be803.html
http://trob.be/ktx/video-m-v-c-be804.html
http://trob.be/ktx/video-m-v-c-be805.html
https://www.fareva.com/mkp/video-m-v-c-be801.html
https://www.fareva.com/mkp/video-m-v-c-be802.html
https://www.fareva.com/mkp/video-m-v-c-be803.html
https://www.fareva.com/mkp/video-m-v-c-be804.html
https://www.fareva.com/mkp/video-m-v-c-be805.html
https://phoenixvilleseniorcenter.org/cdf/video-Br-v-Mc-tv1.html
https://phoenixvilleseniorcenter.org/cdf/video-Br-v-Mc-tv2.html
https://phoenixvilleseniorcenter.org/cdf/video-Br-v-Mc-tv3.html
https://phoenixvilleseniorcenter.org/cdf/video-Br-v-Mc-tv4.html
https://phoenixvilleseniorcenter.org/cdf/video-Br-v-Mc-tv5.html
https://phoenixvilleseniorcenter.org/cdf/video-ro-v-tori-itc-091.html
https://phoenixvilleseniorcenter.org/cdf/video-ro-v-tori-itc-092.html
https://phoenixvilleseniorcenter.org/cdf/video-ro-v-tori-itc-093.html
https://phoenixvilleseniorcenter.org/cdf/video-ro-v-tori-itc-094.html
https://phoenixvilleseniorcenter.org/cdf/video-ro-v-tori-itc-095.html
https://phoenixvilleseniorcenter.org/cdf/video-milano-v-night-fpt-01.html
https://phoenixvilleseniorcenter.org/cdf/video-milano-v-night-fpt-02.html
https://phoenixvilleseniorcenter.org/cdf/video-milano-v-night-fpt-03.html
https://phoenixvilleseniorcenter.org/cdf/video-milano-v-night-fpt-04.html
https://phoenixvilleseniorcenter.org/cdf/video-milano-v-night-fpt-05.html
https://phoenixvilleseniorcenter.org/cdf/video-Udc-v-La-cl-01.html
https://phoenixvilleseniorcenter.org/cdf/video-Udc-v-La-cl-02.html
https://phoenixvilleseniorcenter.org/cdf/video-Udc-v-La-cl-03.html
https://phoenixvilleseniorcenter.org/cdf/video-Udc-v-La-cl-04.html
https://phoenixvilleseniorcenter.org/cdf/video-Udc-v-La-cl-05.html
https://skatepowerplay.com/tdn/video-Br-v-Mc-tv1.html
https://skatepowerplay.com/tdn/video-Br-v-Mc-tv2.html
https://skatepowerplay.com/tdn/video-Br-v-Mc-tv3.html
https://skatepowerplay.com/tdn/video-Br-v-Mc-tv4.html
https://skatepowerplay.com/tdn/video-Br-v-Mc-tv5.html
https://skatepowerplay.com/tdn/video-ro-v-tori-itc-091.html
https://skatepowerplay.com/tdn/video-ro-v-tori-itc-092.html
https://skatepowerplay.com/tdn/video-ro-v-tori-itc-093.html
https://skatepowerplay.com/tdn/video-ro-v-tori-itc-094.html
https://skatepowerplay.com/tdn/video-ro-v-tori-itc-095.html
https://skatepowerplay.com/tdn/video-milano-v-night-fpt-01.html
https://skatepowerplay.com/tdn/video-milano-v-night-fpt-02.html
https://skatepowerplay.com/tdn/video-milano-v-night-fpt-03.html
https://skatepowerplay.com/tdn/video-milano-v-night-fpt-04.html
https://skatepowerplay.com/tdn/video-milano-v-night-fpt-05.html
https://skatepowerplay.com/tdn/video-Udc-v-La-cl-01.html
https://skatepowerplay.com/tdn/video-Udc-v-La-cl-02.html
https://skatepowerplay.com/tdn/video-Udc-v-La-cl-03.html
https://skatepowerplay.com/tdn/video-Udc-v-La-cl-04.html
https://skatepowerplay.com/tdn/video-Udc-v-La-cl-05.html
https://www.visualcom.it/snf/video-ro-v-tori-itc-091.html
https://www.visualcom.it/snf/video-ro-v-tori-itc-092.html
https://www.visualcom.it/snf/video-ro-v-tori-itc-093.html
https://www.visualcom.it/snf/video-ro-v-tori-itc-094.html
https://www.visualcom.it/snf/video-ro-v-tori-itc-095.html
https://www.visualcom.it/snf/video-milano-v-night-fpt-01.html
https://www.visualcom.it/snf/video-milano-v-night-fpt-02.html
https://www.visualcom.it/snf/video-milano-v-night-fpt-03.html
https://www.visualcom.it/snf/video-milano-v-night-fpt-04.html
https://www.visualcom.it/snf/video-milano-v-night-fpt-05.html
Sequence
Being the root protocol for all collection types,Sequence
is, so to speak, a proto-collection, which contains only basic rudiments to grow into something more. Pretty much as its name suggests, Sequence
models a list of values that you can sequentially step through, one at a time. The only method any Sequence
must implement is makeIterator()
, which returns an instance of IteratorProtocol
— the associated type of Sequence
protocol.
IteratorProtocol
in turn has the only required method, next()
, returning the next element each time it gets called or nil if no next element exists. By the way, you can make your custom type conform to both Sequence
and IteratorProtocol
. Then you only have to implement next()
, while makeIterator()
would be inferred automatically. This really simple mechanism is exactly what a for-in
loop exploits under the hood. So if you are for
-looping over something, this something is a Sequence
. for-in
syntax is not necessary, of course; you can well use iterators explicitly.
So, strictly speaking, Sequence
does not necessarily contain some list of elements; rather, it represents some abstract ability to generate them sequentially on the fly. For example, you can create the sequence OddNumbers
whose Element
is Int
. The first element would be 1, and each call of next()
would increment the previous one by 2 and return it. Working with this basic sequence also reminds me of reading from a file in some old-fashioned environment. You read it bite by bite until you stumble upon the end-of-file symbol, which is an analogue of calling iterator.next()
until it returnsnil
.
While these basics of Sequence
seem very simple, they are just enough to develop into a rich and flourishing API, available for any sequence for free. The standard library provides default implementations for a large number of useful operations, starting from searching (contains()
, first(where:)
, etc.) and ending with filter
, map
and reduce
— those pillars of functional programming. You can also shuffle, reverse, join, split your sequence, drop its elements, extract prefixes and suffixes, and many many more. And of course now with Combine, you can get a Publisher
for your sequence for free.
There is an important disclaimer about the Sequence
protocol: There is no requirement for whether it will be destructively consumed by iteration. Meaning, if you’ve already performed a for-in
loop over an instance of the sequence, and you then start another one over the same instance, you can’t be sure whether it will resume iterating, or start it over, or do nothing more — the behaviour you get is just undefined. If you do need such non-destructive iteration, Collection
is here to serve you, and we’ll get back to this topic soon.
Collection
Collection
protocol inherits from Sequence
, adding the requirement that any element can be accessed directly by its position, which secures more elaborate and controlled management of its elements. Semantically, collection models a pre-defined, finite list of elements, each stuck to a certain position. It doesn’t just have an ability to iterate, as Sequence
does.
To give you another illustration, Sequence
can be viewed as a stack of paperwork on your desk that you can only go through from top to bottom, checking out each item until you reach what you’re looking for. In contrast, Collection
is something that reflects some expanding file folder, with labels on each pocket, bookmarks, and even a table of contents for faster reference.
For that purpose, Collection
introduces the concept of indices. It declares startIndex
and endIndex
properties to define the indices bounds, and the index(after:)
method for advancing the index. Note that those methods make the indexing work for and arbitrary type, not necessarily for integers. Index
is the associated type of the Collection
protocol, and the only restriction for type taking its place is to be Comparable
. Finally, there is at least a read-only subscript used for accessing the element at a given index. And it can be considered the cornerstone of Collection
’s functionality.
In addition, Collection
declares and provides default implementations for the next tranche of cool out-of-the-box APIs that take advantage of indexing. It’s not only accessing by index but also prefixing, suffixing, slicing, determining the distance between indices, etc. Even count
first appears in Collection
. For Sequence
, we have only underestimatedCount
, which is by definition less than or equal to the estimated number of elements, and by default it is zero. ’Cause you never know when your iterator’s next()
is going to return its final nil
.
Having indexed access naturally implies another requirement that brings Collection
to the next level as compared to Sequence
. Unlike what we have for Sequence
, types conforming to Collection
must guarantee that each time you traverse it with a for-in
loop (or using an iterator), iteration will start over and you’ll be getting the same results as the previous time (of course, unless you didn’t modify it in between these two loops).
In this example, the first for-in
loop has not consumed the first two elements of the array. So when you relaunch the iterating, it will start from the beginning. Remember that for Sequence
, the behaviour of the second loop is just not defined. Also notice another way of iterating over Collection
with the use of index and subscript, in addition to the basic one inherited from Sequence
.
Let me stay a little more on this deep connection between the ability for direct access by index and the requirement of nondestructive, repeatable traversing. Technically, some basic indexed access is applicable not only to Collection
but to Sequence
as well. You can declare a separate counter that you increment and check on each iteration of for-in
and trigger logic you need once it reached your desired index value.
But not to mention how awkward it is, this is hardly usable for real cases at all, and that’s just because Sequence
by default can’t be repeatedly traversed. Normally you use an index either if it was saved at some previous point, resorting to your Collection
, or to save the index for future access to the same element. So this is at least a two-step process — save the index and use the index, in any order — which presumes that your list of objects is durable and stable enough to be iterated over repeatedly. Otherwise, there is no reason for indexing it, even if technically you can.
Collection Types
So far we mostly talked about protocol structure and meaning, but let’s finally turn to concrete collection types. Now with all that sacred knowledge about collection protocols, you should be able to create your custom type, but you probably won’t. The standard ones (Array
, Dictionary
, and Set
) are pretty exhaustive, well optimised, and just enough in 99.9 of cases. Let’s see how exactly they are inscribed into the complete collection protocols hierarchy.
Collection types family tree
As you can see now and as mentioned in the intro, Array
is connected with the Collection
protocol through the number of transitional forms. In this article, we’re not going to focus on all of them, but if you wish to go into it deeper, you may check out this great article by Ole Begemann. But here let’s highlight at least some interesting things.
As you can see from the diagram above, one of the conformances of Array
is the MutableCollection
protocol. Well, it doesn’t seem counterintuitive at all; of course Array
is mutable! But if your first association was methods like append()
, or insert()
, or remove()
, this is not the case. The thing is, documentation of the MutableCollection
protocol postulates that changing values of its elements should not affect the length of Collection
. But obviously, append()
, insert()
, and remove()
do, so they don’t fall within this protocol. They’re declared and implemented in Array
itself. What about MutableCollection
? It is mostly about having write access for subscript
, in addition to the read-only access inherited from Collection
.
That is also why Set
, being actually mutable, bypasses MutableCollection
and immediately conforms to Collection
. But what about Dictionary
? It does have a subscript that sets the value for the key, replacing the previous one and thus not affecting the size. What is wrong with it, then?
Well, here comes another subtlety, and it’s time for a more careful look at the Index
of the dictionary. This is a bit tangled at first sight, but Key
in Dictionary
has nothing to do with Collection.Index
, just like Value
is completely unrelated to Sequence.Element
. So if you declare, let’s say, Dictionary<Int, String>
, its Key
would be Int
, its Value
is String
, and its Element
is the tuple (key: Int, value: String)
.
And what about Index
? It is a separate struct, internally declared in the dictionary, in our caseDictionary<Int, String>.Index
. This struct basically holds all that hashing-related stuff that the internal implementation of Dictionary
relies on in order to be a hash table. By the way, something similar also applies to Set.Index
, which we also rarely use directly.
So, key and value are something specific to Dictionary
, and they belong to its implementation, not to that of the Collection
protocol. Hence, the widely used get-set subscript(Key)->Value?
of Dictionary
is not in any way related to the read-only subscript(Index) -> Element
, required by the Collection
protocol. Of course, the latter is also still in play for Dictionar
, and you can well use it (as well as all other operations with Index
inherited from Collection
). But that’s not what we normally do, because Key
and Value
are here to make our lives easier. Here’s a short code snippet to summarise all this:
By the way, if you were ever wondering why the subscript of Dictionary
is optional while the apparently equivalent one of Array
is not, you’ve got your answer — they are not equivalent.
Sequence vs Collection — Some More Variations on a Theme
If you’re not yet overburdened with the above, let me share some more speculations on similarities and differences between the Sequence
and Collection
protocols. Hopefully you’ll have some practical takeaways even from the vague allegations in the section that follows.
Being different semantically, these protocols also provide operations of a somewhat different nature. This creates a certain division of roles between them among the APIs of each concrete collection type that conforms to both protocols. Feeling which protocol some method or property is more inherent to may be helpful when it comes to designing your own API that involves collections. Although in most real-life cases you are just okay with specifying a concrete type, sometimes you do have to be less restrictive and put Collection
, or even less restrictive and use Sequence
, depending on which behaviour or portion of their API is expected or necessary in this particular scope. After all, keeping your API as generic as possible is a good practice, and it would never harm if used wisely.
So the general rule of thumb, as you may have already guessed, would be that Collection
is primarily used when you have some knowledge of certain elements’ positions or need to gain this knowledge for future use, whereas Sequence
is just enough if you’re okay to flip through all of them from the beginning.
But here are some further reflections. Notice that the most typical operations over Sequence
cover the cases where you deal with the multitude of its elements as a whole, like converting it all with map
, digesting it all with reduce
, boiling it down with filter
, etc. Even when you search for an individual element of Sequence
(e.g., with first(where:)
), internally it still tests all of them one by one, so it generally still interacts with them as a list.
To put it mathematically (just in case you also like making simple things complicated), such operations are effectively the functions in which input is a list of elements and output is also a list of elements. It’s just a particular case that the result may happen to be an empty list, or that it is forcibly trimmed to be just one element. For example, first(where:)
, the primary means of accessing a certain element in the sequence, is just a trimmed version of filter(where:)
. It drops all but the first element of the results of filtering or returns nil
, if filtering resulted in an empty list. And, let’s say, contains(where:)
is just another way of saying whether the size of the list of elements retrieved is more than zero.
By contrast, Collection
is fundamentally aimed at giving an unambiguous and guaranteed result. The indexed subscript, the heart of the API of Collection
, provides one and only one element. If there is no such single element, it crashes. So it is not optional, unlike most of the semantically corresponding methods of Sequence
. This represents the mathematical idea of projection from a list of elements to a single and obligatory one. Again, this is not a strict rule or definition but more like trying to find some patterns and regularities.
Another distinction to be mentioned again is that only with Collection
do you first get the ability to know the number of its elements. It’s not possible with Sequence
because you never know beforehand when the iteration over it will reach the last element. For the same reason, be aware that Sequence
is conceptually infinite, so both traversing it and using most of its methods (that internally still involve traversing) are not quite safe. The next()
method of its iterator may keep returning a non-nil value till the end of time. This is not something you’re likely to run into when using sequences from a standard library, but when dealing with an unknown sequence, you can’t be 100% sure that you won’t get stuck inside of some method.
It looks like the only methods you can rely on are those dealing with a finite prefix (like dropFirst(Int)
and prefix(Int)
). By the way, not knowing the size of a sequence also prevents you from accessing the last element and any logic that involves a suffix. (Well, frankly speaking, there are still suffix()
and dropLast()
methods in Sequence
, but they require deliberate restriction of the receiver to be finite and are somewhat dissonant here theoretically).
And the last thing to mention for today is performance. Unless documented otherwise, Sequence
must provide its elements no worse than in O(n), while Collection
must make it to the top-notch O(1), which is another win of having indexed access. However, the great law of performance-memory trade-off predicts that Collection
is more expensive in terms of memory. Of course, it all depends on concrete implementations, but as a concept — yes it is, and that comes from the requirement of repeated access. It’s kind of a loose analogy, but Collection
can be considered more state-oriented (indeed, dealing with index implies saving the state), while Sequence
is more in tune with a functional approach.
Whew, that’s all I wanted to tell you. If you followed this far, I should be really proud for keeping your attention for that long! Share your thoughts and questions in the comments. Thanks for reading, and see you next time