Learn the principles of multi-threading with the GCD framework in Swift. Queues, tasks, groups: everything you'll ever need, I promise.
GCD concurrency tutorial for beginners
The Grand Central Dispatch (GCD, or just Dispatch) framework is based on the underlying thread pool design pattern. This means that there is a fixed number of threads spawned by the system (based on factors like the number of CPU cores), and they're always available, waiting for tasks to be executed concurrently. 🚦
Creating threads on the fly is an expensive operation, so GCD organizes tasks into specific queues, and the tasks waiting on these queues are later executed on an available thread from the pool. This approach leads to great performance and low execution latency. We can say that the Dispatch framework is a very fast and efficient concurrency framework designed for modern multicore hardware and needs.
Concurrency, multi-tasking, CPU cores, parallelism and threads
A processor can run the tasks you create programmatically; writing those tasks is what we usually call coding, developing or programming. The code executed by a CPU core is a thread, so your app is going to create a process that is made up of threads. 🤓
In the past a processor had one single core, so it could only deal with one task at a time. Later on time-slicing was introduced, so CPUs could execute threads concurrently using context switching. As time passed, processors gained more horsepower and more cores, so they became capable of real multi-tasking using parallelism. ⏱
Nowadays a CPU is a very powerful unit, capable of executing billions of cycles per second. To squeeze even more out of this, Intel introduced a technology called hyper-threading: CPU clock cycles are divided between (usually two) threads running on the same core at the same time, so the number of available threads essentially doubles. 📈
As you can see, concurrent execution can be achieved with various techniques, but you don't need to care about that too much. It's up to the CPU architecture how it solves concurrency, and it's the operating system's task to decide how many threads are going to be spawned for the underlying thread pool. The GCD framework hides all this complexity, but it's always good to understand the basic principles. 👍
Synchronous and asynchronous execution
Each work item can be executed either synchronously or asynchronously.
Have you ever heard of blocking and non-blocking code? This is the same situation here. With synchronous tasks you'll block the execution queue, but with async tasks your call will instantly return and the queue can continue the execution of the remaining tasks (or work items, as Apple calls them). 🚧
When a work item is executed synchronously with the sync method, the program waits until execution finishes before the method call returns.
Your function is most likely synchronous if it has a return value, so
func load() -> String is probably going to block the thread it runs on until the resource is completely loaded and returned.
When a work item is executed asynchronously with the async method, the method call returns immediately.
Completion blocks are a good sign of async methods. For example, if you look at this method
func load(completion: (String) -> Void) you can see that it has no return type, but the result of the function is passed back to the caller later on through a completion block.
This is a typical use case, if you have to wait for something inside your method like reading the contents of a huge file from the disk, you don't want to block your CPU, just because of the slow IO operation. There can be other tasks that are not IO heavy at all (math operations, etc.) those can be executed while the system is reading your file from the physical hard drive. 💾
With dispatch queues you can execute your code synchronously or asynchronously. With synchronous execution the queue waits for the work item to complete; with async execution the call returns immediately without waiting for the task to finish. ⚡️
As I mentioned before, GCD organizes tasks into queues; these are just like the queues at the shopping mall. On every dispatch queue, tasks are executed in the same order as you add them (FIFO: the first task in the line is executed first), but you should note that the order of completion is not guaranteed. Tasks complete according to how long they take, so if you add two tasks to the queue, a slow one first and a fast one later, the fast one can finish before the slower one. ⌛️
Serial and concurrent queues
There are two types of dispatch queues. Serial queues can execute one task at a time; these queues can be utilized to synchronize access to a specific resource. Concurrent queues, on the other hand, can execute one or more tasks in parallel at the same time. A serial queue is just like one line in the mall with one cashier; a concurrent queue is like one single line that splits towards two or more cashiers. 💰
Main, global and custom queues
The main queue is a serial one, every task on the main queue runs on the main thread.
Global queues are system-provided concurrent queues shared throughout the operating system. There are exactly four of them, organized by high, default and low priority, plus an IO-throttled background queue.
Custom queues can be created by the user. Custom concurrent queues are always mapped onto one of the global queues by specifying a Quality of Service (QoS) property. In most cases, if you want to run tasks in parallel it is recommended to use one of the global concurrent queues; you should only create custom serial queues.
System provided queues
- Serial main queue
- Concurrent global queues
- high priority global queue
- default priority global queue
- low priority global queue
- global background queue (io throttled)
Custom queues by quality of service
- userInteractive (UI updates) -> serial main queue
- userInitiated (async UI related tasks) -> high priority global queue
- default -> default priority global queue
- utility -> low priority global queue
- background -> global background queue
- unspecified (lowest) -> low priority global queue
Enough from the theory, let's see how to use the Dispatch framework in action! 🎬
How to use the DispatchQueue class in Swift?
Here is how you can get all the queues from above using the modern GCD syntax available from Swift 3. Please note that you should always use a global concurrent queue instead of creating your own one, except if you are going to use the concurrent queue for locking with barriers to achieve thread safety; more on that later. 😳
How to get a queue?
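A minimal sketch of getting each queue type with the Swift 3+ syntax; the `com.example` labels are just placeholder names, use your own reverse-domain identifiers:

```swift
import Dispatch

// The serial main queue (tied to the main thread)
let main = DispatchQueue.main

// System-provided global concurrent queues by QoS
let background = DispatchQueue.global(qos: .background)
let utility = DispatchQueue.global(qos: .utility)
let defaultQueue = DispatchQueue.global(qos: .default)
let userInitiated = DispatchQueue.global(qos: .userInitiated)
let userInteractive = DispatchQueue.global(qos: .userInteractive)

// Custom serial queue (serial is the default attribute)
let serial = DispatchQueue(label: "com.example.serial")

// Custom concurrent queue
let concurrent = DispatchQueue(label: "com.example.concurrent",
                               attributes: .concurrent)
```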
So executing a task on a background queue and updating the UI on the main queue after the task finished is a pretty easy one using Dispatch queues.
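Here is a sketch of that pattern; the heavy work is faked with a big reduce, and the "UI update" is just a print:

```swift
import Dispatch

DispatchQueue.global(qos: .userInitiated).async {
    // long-running work off the main thread
    let result = (1...1_000_000).reduce(0, +)

    DispatchQueue.main.async {
        // back on the main queue, safe to touch the UI
        print("Result: \(result)")
    }
}
```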
Sync and async calls on queues
There is no big difference between sync and async methods on a queue. Sync is just an async call with a semaphore (explained later) that waits for the return value. A sync call will block, on the other hand an async call will immediately return. 🎉
Basically, if you need a return value use sync, but in every other case just go with async. DEADLOCK WARNING: you should never call sync on the main queue from the main thread, because it'll cause a deadlock and a crash. You can use this snippet if you are looking for a safe way to do sync calls on the main queue / thread. 👌
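One possible shape for such a snippet; the `mainSyncSafe` name is my own, the idea is simply to check for the main thread before calling sync:

```swift
import Foundation
import Dispatch

extension DispatchQueue {
    // Runs the block synchronously on the main queue,
    // without deadlocking if we're already on the main thread.
    static func mainSyncSafe<T>(_ work: () -> T) -> T {
        if Thread.isMainThread {
            return work()
        }
        return DispatchQueue.main.sync(execute: work)
    }
}
```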
Don't call sync on a serial queue from the serial queue's thread!
You can simply delay code execution using the Dispatch framework.
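For example, with asyncAfter the work item is scheduled to run after the given deadline:

```swift
import Dispatch

DispatchQueue.main.asyncAfter(deadline: .now() + 2) {
    print("Executed after a 2 second delay")
}
```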
Perform concurrent loop
Dispatch queues simply allow you to perform iterations concurrently.
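The concurrentPerform method runs the closure for every index on multiple threads and blocks until all iterations have finished:

```swift
import Dispatch

// Runs 8 iterations concurrently; the call returns
// only after every iteration has completed.
DispatchQueue.concurrentPerform(iterations: 8) { index in
    print("Iteration #\(index)")
}
```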
Oh, by the way, it's just for debugging purposes, but you can return the name of the current queue by using this little extension. Do not use it in production code! ⚠️
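A sketch of such an extension, built on the private-looking `__dispatch_queue_get_label` function, which is exactly why it belongs in debug builds only:

```swift
import Dispatch

extension DispatchQueue {
    // ⚠️ Debugging only: reads the label of the queue
    // the current code is running on.
    static var currentLabel: String {
        return String(cString: __dispatch_queue_get_label(nil))
    }
}
```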
Using DispatchWorkItem in Swift
DispatchWorkItem encapsulates work that can be performed. A work item can be dispatched onto a DispatchQueue and within a DispatchGroup. A DispatchWorkItem can also be set as a DispatchSource event, registration, or cancel handler.
So, just like with operations, you can cancel a running task by using a work item. Also, work items can notify a queue when their task is completed.
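A minimal sketch of dispatching a work item, getting notified on completion, and cancelling it (note that cancel only prevents an item that hasn't started yet from running):

```swift
import Dispatch

let queue = DispatchQueue.global()

let workItem = DispatchWorkItem {
    print("working...")
}

// Get notified on the main queue when the item finishes
workItem.notify(queue: .main) {
    print("work item finished")
}

queue.async(execute: workItem)

// later on, if the item hasn't started executing yet:
// workItem.cancel()
```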
Concurrent tasks with DispatchGroups
So you need to perform multiple network calls in order to construct the data required by a view controller? This is where DispatchGroup can help you. All of your long-running background tasks can be executed concurrently, and when everything is ready you'll receive a notification. Just be careful: you have to use thread-safe data structures, so always modify arrays, for example, on the same thread! 😅
Note that you always have to balance out the enter and leave calls on the group. The dispatch group also allows us to track the completion of different work items, even if they run on different queues.
One more thing that you can use dispatch groups for: imagine that you're displaying a nicely animated loading indicator while you do some actual work. It might happen that the work is done faster than you'd expect and the indicator animation can't finish. To solve this situation you can add a small delay task so the group will wait until both of the tasks finish. 😎
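A sketch combining both techniques, with the "network calls" faked by simple async blocks; the sleep stands in for a minimum-animation-time delay task:

```swift
import Foundation
import Dispatch

let group = DispatchGroup()
let queue = DispatchQueue.global()

// Manually balanced enter / leave calls
group.enter()
queue.async {
    // first "network call"
    group.leave()
}

group.enter()
queue.async {
    // second "network call"
    group.leave()
}

// A small delay task so a loading animation can finish;
// async(group:) balances enter / leave for you.
queue.async(group: group) {
    Thread.sleep(forTimeInterval: 0.5)
}

group.notify(queue: .main) {
    print("all tasks are done")
}
```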
A semaphore is simply a variable used to handle resource sharing in a concurrent system. It's a really powerful object, here are a few important examples in Swift.
How to make an async task synchronous?
The answer is simple, you can use a semaphore (bonus point for timeouts)!
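A sketch of the pattern; the hypothetical `load(completion:)` function stands in for any async API:

```swift
import Dispatch

// A hypothetical async method, for illustration
func load(completion: @escaping (String) -> Void) {
    DispatchQueue.global().async {
        completion("some content")
    }
}

let semaphore = DispatchSemaphore(value: 0)
var result: String?

load { value in
    result = value
    semaphore.signal()
}

// Bonus point: wait with a timeout instead of forever
_ = semaphore.wait(timeout: .now() + 5)
print(result ?? "timed out")
```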
Lock / single access to a resource
If you want to avoid race conditions you are probably going to use mutual exclusion. This can be achieved using a semaphore object, but if your object needs heavy reading capability you should consider a dispatch barrier based solution instead. 😜
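A binary semaphore (initial value 1) acts as a simple lock; the `Counter` class here is just an illustration:

```swift
import Dispatch

final class Counter {
    // value 1 means: only one thread may hold the "lock" at a time
    private let semaphore = DispatchSemaphore(value: 1)
    private var value = 0

    func increment() {
        semaphore.wait()   // lock
        value += 1
        semaphore.signal() // unlock
    }

    var current: Int {
        semaphore.wait()
        defer { semaphore.signal() }
        return value
    }
}
```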
Wait for multiple tasks to complete
Just like with dispatch groups, you can also use a semaphore object to get notified if multiple tasks are finished. You just have to wait for it...
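The trick is to wait once per task on a semaphore that starts at zero:

```swift
import Dispatch

let semaphore = DispatchSemaphore(value: 0)
let queue = DispatchQueue.global()
let taskCount = 3

for i in 1...taskCount {
    queue.async {
        print("task #\(i) done")
        semaphore.signal()
    }
}

// Block until every task has signaled once
for _ in 1...taskCount {
    semaphore.wait()
}
print("all tasks finished")
```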
Batch execution using a semaphore
You can create a thread pool like behavior to simulate limited resources using a dispatch semaphore. So for example if you want to download lots of images from a server, you can run batches of x at a time. Quite handy. 🖐
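Starting the semaphore at the batch size means at most that many tasks run at once; each new task must wait for a slot to free up. The "download" here is only a print:

```swift
import Dispatch

let batchSize = 4
let semaphore = DispatchSemaphore(value: batchSize)
let queue = DispatchQueue.global()

for i in 1...20 {
    semaphore.wait() // blocks while 4 "downloads" are in flight
    queue.async {
        // download image #i here...
        print("downloading #\(i)")
        semaphore.signal() // free up a slot
    }
}
```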
The DispatchSource object
A dispatch source is a fundamental data type that coordinates the processing of specific low-level system events.
Signals, descriptors, processes, ports, timers and many more. Everything is handled through the dispatch source object. I really don't want to get into the details, it's quite low-level stuff. You can monitor files, ports and signals with dispatch sources. Please just read the official Apple docs. 📄
I'd like to make only one example here using a dispatch source timer.
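A minimal repeating-timer sketch; keep a strong reference to the source, because a deallocated timer stops firing:

```swift
import Dispatch

let timer = DispatchSource.makeTimerSource(queue: .global())
timer.schedule(deadline: .now(), repeating: .seconds(1))
timer.setEventHandler {
    print("tick")
}
timer.resume() // sources start suspended, so resume is required
```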
Thread-safety using the dispatch framework
Thread safety is an inevitable topic when it comes to multi-threaded code. In the beginning I mentioned that there is a thread pool under the hood of GCD. Every thread can have a run loop object associated with it, and you can even run them by hand. If you create a thread manually, a run loop is created for it lazily the first time you ask for it.
⚠️ You should not do this, demo purposes only, always use GCD queues!
Queue != Thread
A GCD queue is not a thread, if you run multiple async operations on a concurrent queue your code can run on any available thread that fits the needs.
Thread safety is all about avoiding messed up variable states
Imagine a mutable array in Swift. It can be modified from any thread. That's not good, because eventually the values inside it are going to get messed up if the array is not thread safe. For example, multiple threads try to insert values into the array. What happens? If they run in parallel, which element is going to be added first? This is why you sometimes need to create thread-safe resources.
You can use a serial queue to enforce mutual exclusivity. All the tasks on the queue will run serially (in a FIFO order), only one process runs at a time and tasks have to wait for each other. One big downside of the solution is speed. 🐌
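A sketch of the serial-queue approach; the wrapper type and queue label are my own names:

```swift
import Dispatch

final class SafeArray<Element> {
    private var storage: [Element] = []
    // serial queue: all access happens one task at a time
    private let queue = DispatchQueue(label: "com.example.safe-array")

    func append(_ element: Element) {
        queue.sync { storage.append(element) }
    }

    var elements: [Element] {
        return queue.sync { storage }
    }
}
```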
Concurrent queues using barriers
You can send a barrier task to a queue by providing an extra flag to the async method. If a task like this arrives at the queue, it ensures that nothing else will be executed until the barrier task has finished. To sum this up, barrier tasks are sync points for concurrent queues. Use async barriers for writes, sync blocks for reads. 😎
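The same wrapper idea with a custom concurrent queue and a barrier; again, type and label names are illustrative:

```swift
import Dispatch

final class SafeStore<Element> {
    private var storage: [Element] = []
    private let queue = DispatchQueue(label: "com.example.store",
                                      attributes: .concurrent)

    func append(_ element: Element) {
        // async barrier for writes: exclusive access to storage
        queue.async(flags: .barrier) {
            self.storage.append(element)
        }
    }

    var elements: [Element] {
        // sync for reads: many reads can run in parallel
        return queue.sync { storage }
    }
}
```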
This method results in extremely fast reads in a thread-safe environment. You can also use serial queues, semaphores or locks; it all depends on your current situation, but it's good to know all the available options, isn't it? 🤐
A few anti-patterns
You have to be very careful with deadlocks, race conditions and the readers-writers problem. Usually, calling the sync method on a serial queue from that same queue will cause you most of the trouble. Another issue is thread safety, but we've already covered that part. 😉
The Dispatch framework (aka. GCD) is an amazing one; it has such potential and it really takes some time to master it. The real question is: what path is Apple going to take in order to bring concurrent programming to a whole new level? Promises or async/await, maybe something entirely new; let's hope that we'll see something in Swift 6.
- Tasks and threads in Swift
- Concurrency model in Swift
- Dispatch - Apple Developer Documentation
- Run Loops - Apple Developer Documentation
- Grand Central Dispatch Tutorial for Swift 3
- A deep dive into Grand Central Dispatch in Swift
- Grand Central Dispatch (GCD) and Dispatch Queues in Swift 3
- What is the difference between cores and threads of a processor?
- Creating Thread-Safe Arrays in Swift
- All about Concurrency in Swift - Part 1: The Present