This page outlines the weekly schedule for lectures, labs, assignments, and examinations. The schedule will be updated regularly to align with the University of Juba's academic calendar and holiday schedule. Reading materials, lecture slides, and lab materials will be accessible through this schedule, with links provided for downloading prior to the commencement of each lecture or lab session. If you encounter any difficulties or have questions, please contact the lead Teaching Fellow, Thiong Abraham.
Building directly on last week's dive into instance segmentation, this session on pose estimation takes on the challenge of not just identifying and outlining individual objects, but also pinpointing the precise location of their key body parts or landmarks. We'll explore how computer vision techniques can be used to understand the articulated structure of objects, be it humans, animals, or even tools, opening doors to applications ranging from human-computer interaction and sports analysis to robotics and beyond. To revisit how we first learned to distinguish and segment individual objects before tackling their internal structure, see the materials from last week's instance segmentation lecture in Week 8 below.
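To make the keypoint idea concrete before the session, here is a minimal sketch of person pose estimation using a pre-trained Keypoint R-CNN from torchvision (assumes torchvision >= 0.13; "person.jpg" is a placeholder path, and the 0.9 score threshold is an arbitrary choice for illustration):

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    keypointrcnn_resnet50_fpn, KeypointRCNN_ResNet50_FPN_Weights,
)

weights = KeypointRCNN_ResNet50_FPN_Weights.DEFAULT
model = keypointrcnn_resnet50_fpn(weights=weights).eval()

img = read_image("person.jpg")  # placeholder: any RGB image with people
batch = [weights.transforms()(img)]

with torch.no_grad():
    out = model(batch)[0]

# Each detected person gets 17 COCO keypoints as (x, y, visibility).
for kps, score in zip(out["keypoints"], out["scores"]):
    if score > 0.9:
        print(kps[:, :2])  # pixel coordinates of the body landmarks
```

Note how the output goes beyond a mask or box: each instance carries a structured set of landmark coordinates, which is exactly the articulated structure the lecture is about.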
This lecture dives into the fascinating realm of instance segmentation, building directly upon our understanding of semantic segmentation from last week. While semantic segmentation classifies each pixel into predefined categories, instance segmentation takes it a step further by not only identifying the category of each pixel but also distinguishing between individual instances of the same object. We'll explore cutting-edge architectures and methodologies that enable computers to not just see different objects but also to delineate and count each specific object within an image, opening doors to more granular and sophisticated scene understanding.
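As a preview of the instance-versus-semantic distinction, here is a minimal sketch using a pre-trained Mask R-CNN from torchvision (assumes torchvision >= 0.13; "street.jpg" and the 0.8 threshold are placeholders for illustration):

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights,
)

weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
model = maskrcnn_resnet50_fpn(weights=weights).eval()

img = read_image("street.jpg")  # placeholder path
with torch.no_grad():
    out = model([weights.transforms()(img)])[0]

# Unlike semantic segmentation, each detected instance gets its own mask,
# so two cars yield two separate masks rather than one "car" region.
keep = out["scores"] > 0.8
for label, mask in zip(out["labels"][keep], out["masks"][keep]):
    name = weights.meta["categories"][label]
    binary = mask[0] > 0.5  # soft per-instance mask -> binary mask
    print(name, int(binary.sum()), "pixels")
```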
This week we dive into image segmentation, a core computer vision task where the goal is to assign a label to every pixel in an image, effectively grouping them into meaningful regions. Unlike simpler tasks like image classification or object detection, segmentation provides a much richer understanding of the scene by delineating the precise boundaries of different objects and regions. We'll explore various approaches, from classical techniques based on color and texture to modern deep learning architectures like U-Nets, and discuss their applications in areas like autonomous driving, medical imaging, and robotic perception, building upon the foundational computer vision concepts we've covered this semester.
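To illustrate the U-Net idea of an encoder-decoder with skip connections, here is a toy PyTorch sketch; it is a teaching simplification with assumed layer sizes, not the full architecture from the original paper:

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1, self.enc2 = block(3, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)                 # 64 = 32 (skip) + 32 (up)
        self.head = nn.Conv2d(32, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        s1 = self.enc1(x)                    # high-resolution features
        s2 = self.enc2(self.pool(s1))        # downsampled features
        u = self.up(s2)                      # upsample back
        u = self.dec(torch.cat([u, s1], 1))  # skip connection
        return self.head(u)                  # (N, n_classes, H, W)

logits = TinyUNet()(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 2, 64, 64]) -- a label per pixel
```

The skip connection is the key design choice: it reinjects fine spatial detail lost during downsampling, which is why U-Nets recover precise object boundaries.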
Building on our previous lecture on object detection, where we learned to identify objects within an image, we now delve into the fascinating world of object tracking. While object detection provides a snapshot of what is in an image, tracking allows us to understand how these objects move and evolve over time in a video sequence. This lecture will cover the fundamental concepts and challenges of object tracking, the main categories of tracking algorithms (including kernel-based methods), evaluation metrics for tracking performance, and real-world applications such as surveillance and security, autonomous driving, and human-computer interaction.
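To see how detection turns into tracking, here is a minimal sketch of a greedy IoU-based tracker that associates detections across frames; it is a simple baseline for intuition, not one of the kernel-based methods covered in the lecture:

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) in pixel coordinates.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def track(frames, thresh=0.3):
    tracks, next_id = {}, 0              # track id -> last seen box
    for dets in frames:                  # dets: list of boxes per frame
        assigned = {}
        for box in dets:
            # Greedily match each detection to its best-overlapping track.
            best = max(tracks, key=lambda t: iou(tracks[t], box),
                       default=None)
            if best is not None and iou(tracks[best], box) >= thresh:
                assigned[best] = box
                del tracks[best]         # a track matches one detection
            else:
                assigned[next_id] = box  # no match: start a new track
                next_id += 1
        tracks = assigned
        print(tracks)

# Two frames with one object moving slightly to the right:
# the box keeps id 0 because its IoU across frames exceeds the threshold.
track([[(0, 0, 10, 10)], [(2, 0, 12, 10)]])
```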
This session will focus on course announcements and knowledge-sharing presentations.
This lecture introduces object detection, a computer vision task that extends beyond image classification by not only identifying the objects present in an image but also locating their spatial extent through bounding boxes. Building upon the foundational concepts of image classification, where a model learns to assign a single label to an entire image, object detection models learn to simultaneously classify and localize multiple objects. We'll explore techniques like sliding windows, region proposal networks, and modern architectures such as YOLO and Faster R-CNN, detailing how these methods utilize convolutional neural networks to extract features, predict object classes, and refine bounding box coordinates. The lecture will also cover evaluation metrics like mean Average Precision (mAP) and practical applications of object detection, from autonomous driving to industrial automation.
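For a taste of what a detector's output looks like, here is a minimal inference sketch with a pre-trained Faster R-CNN from torchvision (assumes torchvision >= 0.13; "traffic.jpg" and the 0.8 threshold are placeholders):

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

img = read_image("traffic.jpg")  # placeholder path
with torch.no_grad():
    out = model([weights.transforms()(img)])[0]

# The model classifies AND localizes: each detection is a class label,
# a confidence score, and a bounding box in pixel coordinates.
for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
    if score > 0.8:
        print(weights.meta["categories"][label], box.tolist())
```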
This lecture introduces Vision Transformers (ViTs), a cutting-edge approach to image classification that leverages the power of transformer architectures, originally developed for natural language processing, to analyze visual data. The session will cover the fundamental principles of Vision Transformers, including their architecture, self-attention mechanisms, and how they differ from traditional convolutional neural networks (CNNs). We will explore the advantages of ViTs, such as their ability to capture global context and their scalability to large datasets, as well as their challenges, including computational requirements and data efficiency.
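The two ingredients named above, patch embedding and self-attention, can be shown in a few lines. The sketch below tracks shapes only, with assumed sizes (16x16 patches, 192-dim tokens); a real ViT adds positional embeddings, a class token, multiple heads, and many stacked layers:

```python
import torch
import torch.nn as nn

img = torch.randn(1, 3, 224, 224)

# Patchify: a strided conv embeds each 16x16 patch into a 192-dim token,
# turning the image into a sequence of 14 * 14 = 196 tokens.
patch_embed = nn.Conv2d(3, 192, kernel_size=16, stride=16)
tokens = patch_embed(img).flatten(2).transpose(1, 2)  # (1, 196, 192)

# Single-head self-attention: every patch attends to every other patch,
# which is how ViTs capture global context in a single layer.
q_proj, k_proj, v_proj = (nn.Linear(192, 192) for _ in range(3))
q, k, v = q_proj(tokens), k_proj(tokens), v_proj(tokens)
attn = torch.softmax(q @ k.transpose(-2, -1) / 192 ** 0.5, dim=-1)
out = attn @ v
print(out.shape)  # torch.Size([1, 196, 192]) -- one vector per patch
```

Contrast this with a CNN, whose convolutions only mix information within a local receptive field; here the (196, 196) attention matrix lets any patch influence any other.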
This lecture provides a comprehensive introduction to image classification, a core computer vision task, beginning with an overview of its fundamental principles and diverse applications. The session delves into the algorithmic foundations, highlighting the pivotal role of Convolutional Neural Networks (CNNs) in learning hierarchical image features. Various CNN architectures, including ResNet, VGG, and Inception, are explored, emphasizing their unique strengths and design considerations. The lecture culminates in a hands-on lab exercise where participants apply their newfound knowledge to classify African wildlife images, utilizing a provided dataset and pre-trained models. This practical component allows students to solidify their understanding of CNNs and their application in real-world scenarios, specifically focusing on the identification of species like buffalo, elephant, rhino, and zebra, thereby bridging theoretical concepts with practical implementation.
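In the spirit of the lab, here is a minimal transfer-learning sketch: take an ImageNet pre-trained ResNet and replace its head with a 4-way classifier for buffalo, elephant, rhino, and zebra. The "african_wildlife/train" path is a hypothetical placeholder; the actual lab dataset and materials are linked from this schedule.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# Placeholder path: one subfolder per class (buffalo/elephant/rhino/zebra).
data = datasets.ImageFolder("african_wildlife/train", tfm)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                    # freeze pre-trained features
model.fc = nn.Linear(model.fc.in_features, 4)  # new 4-class head

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for imgs, labels in loader:  # one pass is enough for the sketch
    opt.zero_grad()
    loss_fn(model(imgs), labels).backward()
    opt.step()
```

Freezing the backbone and training only the new head is the quickest baseline; the lab can then explore unfreezing deeper layers once this works.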
This module introduces the course materials to students and gives an overview of Computer Vision, its tasks, applications, and the state of the art.