Glossary

Definitions of key terms related to radiance fields and world generation.

  • 3D Gaussian Splatting (3DGS)

    A technique for real-time radiance field rendering that represents scenes using a sparse set of 3D Gaussians. This approach enables high-quality novel-view synthesis by optimizing Gaussian parameters to match input images, facilitating efficient rendering without the need for neural networks.

  • 3D Reconstruction

    The process of capturing the shape and appearance of real objects to create digital 3D models. This can be achieved through various methods, including photogrammetry, laser scanning, and computational algorithms that interpret 2D images to infer 3D structures.

  • 3D Scanning

    A technique used to capture the physical dimensions and shape of an object or environment by collecting data on its surface. This data is then used to create accurate digital 3D models for applications in industries like manufacturing, entertainment, and cultural heritage preservation.

  • Active Stereo

    An active stereo system enhances traditional stereo vision by projecting a known pattern (structured light) onto the scene to improve correspondence matching between images. This technique is particularly useful in environments with low texture or varying lighting conditions, as the projected pattern provides additional features for accurate depth estimation.

  • Anisotropic Gaussian

    A Gaussian function with direction-dependent properties, allowing for the representation of elongated or oriented features in 3D space. In 3DGS, anisotropic Gaussians help model complex scene geometries more accurately.
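
    As a sketch of one common parameterization (following the scale and rotation factorization used in 3D Gaussian Splatting; the example values are arbitrary), the covariance of an anisotropic Gaussian can be assembled like this:

    ```python
    import numpy as np

    def anisotropic_covariance(scales, rotation):
        """Build a 3x3 covariance Sigma = R S S^T R^T from per-axis scales and a rotation."""
        S = np.diag(scales)                      # anisotropic scaling along each local axis
        return rotation @ S @ S.T @ rotation.T

    # Example: a Gaussian stretched 10x along its local x-axis, with no rotation.
    sigma = anisotropic_covariance(np.array([1.0, 0.1, 0.1]), np.eye(3))
    print(sigma)
    ```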

  • Bundle Adjustment

    An optimization process in computer vision that refines camera parameters and 3D point positions simultaneously to minimize reprojection errors across multiple images, enhancing the accuracy of 3D reconstructions.
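
    Schematically (with generic notation, not tied to any particular solver), bundle adjustment minimizes the summed reprojection error over all cameras C_j and 3D points X_i, where pi(C_j, X_i) projects point i into camera j and x_ij is the observed 2D feature location:

    ```latex
    \min_{\{C_j\},\,\{X_i\}} \; \sum_{i,j} \left\| \pi(C_j, X_i) - x_{ij} \right\|^2
    ```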

  • Camera Calibration

    The process of estimating a camera’s intrinsic parameters (such as focal length and optical center) and extrinsic parameters (such as rotation and translation) to accurately map 3D points in the world to 2D points in an image. This is crucial for precise 3D reconstruction and measurements.
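
    In the standard pinhole model, calibration recovers the intrinsic matrix K and the extrinsics [R | t] that map a homogeneous world point to an image point, up to a scale factor s:

    ```latex
    s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
      = K \,[\, R \;|\; t \,]
        \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
    \qquad
    K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
    ```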

  • Density Field

    A density field describes how dense (or solid) each point of a 3D scene is. This is crucial for volume rendering, as it defines how light interacts with the scene at each point. In NeRF, the density field is learned alongside the radiance field to create realistic 3D representations.

  • Depth Estimation

    The process of determining the distance of objects from a viewpoint, often using stereo vision or monocular cues. Accurate depth estimation is crucial for tasks like 3D reconstruction and scene understanding.

  • Depth Map

    An image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint. Depth maps are used in various applications, including 3D reconstruction, to represent the spatial structure of a scene.

  • Depth Sensor

    A device that measures the distance between the sensor and objects within its field of view, producing depth maps that represent the 3D structure of a scene. Depth sensors employ various technologies, including stereo vision, structured light, and time-of-flight measurements, and are integral components in applications like 3D scanning, robotics, and augmented reality.

  • Differentiable Rendering

    A rendering technique that allows gradients to be computed with respect to scene parameters, enabling the integration of rendering processes into neural network training for tasks like inverse rendering and 3D reconstruction.

  • Epipolar Geometry

    The geometric relationship between two views of a 3D scene, describing the intrinsic projective geometry between them. Understanding epipolar geometry simplifies the search for corresponding points between images, which is fundamental in stereo vision and 3D reconstruction.
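
    For corresponding homogeneous image points x and x' in the two views, the relationship is captured by the fundamental matrix F (or the essential matrix for calibrated, normalized coordinates), which confines each potential match to a line in the other image:

    ```latex
    x'^{\top} F \, x = 0
    ```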

  • Fourier Feature Mapping

    A method that transforms spatial coordinates into higher-dimensional features using sine and cosine functions, enabling neural networks to capture high-frequency details in functions like those used in NeRFs.
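
    A common form (the random Fourier feature mapping of Tancik et al.; B is a matrix of frequencies, often sampled from a Gaussian) transforms an input coordinate v as:

    ```latex
    \gamma(v) = \big[ \cos(2\pi B v), \; \sin(2\pi B v) \big]^{\top}
    ```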

  • Free Viewpoint Video

    A video format that allows users to interactively change the viewing perspective of a scene, providing a more immersive experience. Techniques like NeRFs and 3DGS facilitate the creation of free viewpoint videos.

  • Gaussian Splatting

    A volume rendering technique that represents volumetric data using Gaussian functions, allowing for efficient and high-quality rendering of complex scenes without converting data into surface primitives.

  • Global Illumination

    A set of rendering techniques that simulate both direct and indirect lighting to produce realistic images by accounting for light interactions like reflections and refractions within a scene.

  • Gradient Descent

    An optimization algorithm that iteratively adjusts model parameters in the direction that most reduces the error, commonly used in training neural networks and optimizing 3D reconstructions.
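
    A minimal sketch of the update rule on a toy quadratic loss (the learning rate, step count, and loss are illustrative only):

    ```python
    import numpy as np

    def gradient_descent(grad_fn, theta, lr=0.1, steps=100):
        """Repeatedly step the parameters against the gradient of the loss."""
        for _ in range(steps):
            theta = theta - lr * grad_fn(theta)
        return theta

    # Toy example: minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
    print(gradient_descent(lambda t: 2 * (t - 3.0), np.array([0.0])))  # approaches 3.0
    ```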

  • Homography

    A transformation that maps points from one plane to another, preserving straight lines. In 3D reconstruction, homographies relate two images of the same planar surface, aiding in tasks like image stitching and object recognition.
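
    The sketch below applies a hypothetical 3x3 homography H to a 2D point via homogeneous coordinates (the matrix entries are arbitrary, for illustration only):

    ```python
    import numpy as np

    def apply_homography(H, point):
        """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
        x = np.array([point[0], point[1], 1.0])
        y = H @ x
        return y[:2] / y[2]                       # divide out the homogeneous scale

    H = np.array([[1.0,   0.2,  5.0],
                  [0.0,   1.1, -3.0],
                  [0.001, 0.0,  1.0]])            # arbitrary example homography
    print(apply_homography(H, (10.0, 20.0)))
    ```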

  • Implicit Representation

    A continuous function that encodes a scene’s geometry or appearance, rather than discrete elements like meshes or voxels. For example, NeRF uses a neural network to implicitly model density and radiance at any point in space, allowing for high-resolution reconstructions. Implicit representations are compact, flexible, and ideal for neural scene synthesis.

  • Inverse Rendering

    The process of inferring scene properties such as geometry, materials, and lighting from observed images. By reversing the traditional rendering pipeline, inverse rendering enables the reconstruction of 3D scenes from 2D photographs, facilitating applications like augmented reality and scene understanding.

  • Latent Diffusion Models (LDM)

    A class of generative models that operate in a compressed latent space to produce high-quality images or 3D structures. By focusing on essential features, LDMs enhance computational efficiency, making them suitable for applications like text-to-image synthesis and 3D reconstruction.

  • Light Field

    A function that describes the amount of light traveling in every direction through every point in space. Capturing the light field of a scene allows for post-capture adjustments of focus and perspective, contributing to advanced 3D imaging techniques.

  • Marching Cubes

    An algorithm used to extract a polygonal mesh of an isosurface from a three-dimensional scalar field. Widely employed in medical imaging and scientific visualization, it facilitates the conversion of volumetric data into a surface representation.
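
    As an illustration, scikit-image ships an implementation that extracts a triangle mesh from a scalar volume at a chosen iso-level (the sphere volume below is a made-up example):

    ```python
    import numpy as np
    from skimage import measure

    # Signed values of a sphere of radius 20 sampled on a 64^3 grid.
    x, y, z = np.mgrid[-32:32, -32:32, -32:32]
    volume = np.sqrt(x**2 + y**2 + z**2) - 20.0

    # Extract the zero iso-surface as vertices and triangular faces.
    verts, faces, normals, values = measure.marching_cubes(volume, level=0.0)
    print(verts.shape, faces.shape)
    ```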

  • Mesh Reconstruction

    The process of creating a polygonal mesh that represents the surface of a 3D object or scene. This involves connecting points (vertices) with edges and faces to form a continuous surface, essential for rendering and analysis in computer graphics and 3D modeling.

  • Mip-NeRF

    An extension of Neural Radiance Fields that incorporates multiscale representations to address aliasing issues. By modeling scenes at various levels of detail, Mip-NeRF improves rendering quality, especially when dealing with zoomed-out views or distant objects.

  • Multi-View Reconstruction

    The process of recovering the 3D structure of a scene from multiple 2D images taken from different viewpoints. Techniques like Structure from Motion (SfM) and Multi-View Stereo (MVS) are used to align the images and extract depth and geometry. NeRF extends this idea using neural networks for more detailed reconstructions.

  • Multi-View Stereo (MVS)

    A technique in computer vision that reconstructs 3D structures from multiple images taken from different viewpoints. By identifying corresponding points across images, MVS estimates depth information, enabling the creation of detailed 3D models.

  • Neural Radiance Fields (NeRF)

    A method that uses neural networks to represent a scene’s volumetric radiance and density. By learning this representation from a set of images, NeRF enables the synthesis of novel views, contributing to advancements in 3D reconstruction and virtual reality.
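
    In the formulation from the original NeRF paper, the color of a camera ray r(t) = o + t d is obtained by volume rendering the learned density sigma and view-dependent color c, where T(t) is the accumulated transmittance:

    ```latex
    C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\, \sigma(\mathbf{r}(t))\, \mathbf{c}(\mathbf{r}(t), \mathbf{d})\, dt,
    \qquad
    T(t) = \exp\!\left( -\int_{t_n}^{t} \sigma(\mathbf{r}(s))\, ds \right)
    ```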

  • Neural Rendering

    The application of neural networks to generate or enhance images, often by learning complex scene representations. Neural rendering techniques enable tasks like novel view synthesis, image-based relighting, and the creation of photorealistic graphics.

  • Normal Mapping

    A technique in 3D graphics to simulate the lighting of bumps and dents on a surface without increasing the polygon count. By altering the surface normals, normal mapping adds detailed texture to 3D models, enhancing realism in rendering.

  • Novel View Synthesis

    The process of generating new images of a scene from viewpoints that were not included in the input data. NeRF excels at this task by learning the underlying 3D scene representation and rendering high-quality views from arbitrary angles.

  • Occupancy Grid Mapping

    A method used in robotics and 3D reconstruction to represent the environment as a grid, where each cell indicates the probability of occupancy. This probabilistic approach aids in building maps of unknown environments and is fundamental in Simultaneous Localization and Mapping (SLAM).

  • Opacity

    The amount of light that is blocked or absorbed by a point in a scene. In volume rendering, opacity is computed based on the density at sampled points and determines how much each point contributes to the final color of the pixel. High-opacity regions are typically solid, while low-opacity regions are transparent or semi-transparent.
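
    In the discrete quadrature used by NeRF-style renderers, the opacity of sample i with density sigma_i and spacing delta_i, and its contribution to the pixel color, are:

    ```latex
    \alpha_i = 1 - \exp(-\sigma_i \delta_i),
    \qquad
    C = \sum_i T_i\, \alpha_i\, \mathbf{c}_i,
    \qquad
    T_i = \prod_{j < i} (1 - \alpha_j)
    ```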

  • Overfitting

    This occurs when a model learns to replicate the training data too precisely, failing to generalize to new viewpoints or scenes. In the context of NeRF, overfitting might result in perfect reconstructions of training views but poor performance on novel views. Regularization techniques are often used to mitigate overfitting.

  • Photogrammetry

    The science of making measurements from photographs, especially for recovering the exact positions of surface points. In 3D reconstruction, photogrammetry is used to create models of real-world objects or environments by analyzing overlapping photographs taken from different angles.

  • Photometric Stereo

    A technique for estimating the surface normals of objects by observing them under different lighting conditions. This method captures fine surface details, beneficial for accurate 3D reconstruction.
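
    For a Lambertian surface observed under k known light directions l_i, each intensity constrains the albedo-scaled normal g = rho n, which can be recovered by least squares (a standard formulation; calibrated lighting is assumed):

    ```latex
    I_i = \rho\,(\mathbf{n} \cdot \mathbf{l}_i),
    \qquad
    \mathbf{g} = \arg\min_{\mathbf{g}} \sum_{i=1}^{k} \left( I_i - \mathbf{l}_i^{\top} \mathbf{g} \right)^2,
    \qquad
    \rho = \lVert \mathbf{g} \rVert, \;\; \mathbf{n} = \mathbf{g} / \lVert \mathbf{g} \rVert
    ```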

  • Photon Mapping

    A global illumination technique used in traditional rendering to approximate the behavior of light in a scene by tracing photons as they bounce off surfaces. While different from neural methods, it shares the goal of simulating realistic light transport.

  • Plenoptic Function

    A function that describes the intensity of light in a scene as a function of position, direction, wavelength, and time. Understanding the plenoptic function is fundamental in computational imaging and forms the basis for technologies like light field cameras.
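
    In its full form the plenoptic function is seven-dimensional; dropping wavelength and time yields the 5D version commonly used in light field work:

    ```latex
    L = P(x, y, z, \theta, \phi, \lambda, t)
    ```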

  • Point Cloud

    A collection of data points defined in a three-dimensional coordinate system, representing the external surface of objects or scenes. Point clouds are often generated by 3D scanners and are used in various applications, including 3D modeling and reconstruction.

  • Point-Based Rendering

    A rendering technique that uses points as the basic rendering primitive instead of polygons. This approach is particularly useful for rendering complex 3D models derived from point cloud data, such as those obtained from 3D scanning.

  • Positional Encoding

    A technique used to map low-dimensional input coordinates (e.g., 3D positions) to a higher-dimensional space to allow neural networks to learn high-frequency details. In NeRF, positional encoding transforms input positions and viewing directions into a more expressive feature space, enabling the model to represent fine details like sharp edges and texture. It’s critical for overcoming the limitations of standard neural networks.
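
    A minimal sketch of the sinusoidal encoding used in NeRF, mapping each coordinate to sines and cosines at exponentially increasing frequencies (the number of frequency bands is a tunable choice):

    ```python
    import numpy as np

    def positional_encoding(p, num_bands=10):
        """Map coordinates p of shape (..., D) to (..., 2 * num_bands * D) features."""
        p = np.asarray(p, dtype=np.float64)
        freqs = 2.0 ** np.arange(num_bands) * np.pi           # frequencies 2^k * pi
        angles = p[..., None] * freqs                         # shape (..., D, num_bands)
        features = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
        return features.reshape(*p.shape[:-1], -1)

    print(positional_encoding(np.array([[0.5, -0.2, 0.1]])).shape)  # (1, 60)
    ```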

  • Radiance Field

    A mathematical representation that describes the distribution of light (color and brightness) in a 3D space, including how it varies depending on the direction. It captures both the geometry and appearance of a scene by encoding spatial and directional dependencies of light. Popular radiance field representations include 3D Gaussian Splatting and NeRFs.

  • Ray Marching

    A rendering technique that traces rays through a scene to compute color and lighting by accumulating information along the ray’s path. Ray marching is commonly used in volume rendering and is integral to techniques like NeRFs.
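
    A minimal sketch of marching one ray through a field and compositing front to back (`field` here is a hypothetical callable returning density and RGB at a point; the toy scene is a fuzzy red ball):

    ```python
    import numpy as np

    def march_ray(origin, direction, field, near=0.0, far=4.0, num_samples=64):
        """Accumulate color along a ray by sampling a (density, rgb) field."""
        t_vals = np.linspace(near, far, num_samples)
        delta = t_vals[1] - t_vals[0]
        color, transmittance = np.zeros(3), 1.0
        for t in t_vals:
            sigma, rgb = field(origin + t * direction)   # density and color at the sample
            alpha = 1.0 - np.exp(-sigma * delta)         # opacity of this segment
            color += transmittance * alpha * rgb         # front-to-back compositing
            transmittance *= 1.0 - alpha
        return color

    # Toy field: a fuzzy red ball of radius 1 centered at the origin.
    field = lambda p: (5.0 if np.linalg.norm(p) < 1.0 else 0.0, np.array([1.0, 0.0, 0.0]))
    print(march_ray(np.array([0.0, 0.0, -2.0]), np.array([0.0, 0.0, 1.0]), field))
    ```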

  • Ray Sampling

    The process of tracing rays from a camera through a scene to gather information about the light and geometry at sampled points. These sampled points are used to compute the final rendered image. Efficient and accurate ray sampling is essential for training radiance fields and synthesizing novel views.

  • Ray Tracing

    A rendering technique that simulates the way light interacts with objects to produce realistic images. Ray tracing calculates the paths of rays of light as they travel through a scene, allowing for the simulation of effects like reflections, refractions, and shadows.

  • Regularization

    Methods used to constrain a model’s complexity and encourage it to generalize beyond the training data. Common techniques in radiance fields include sparsity constraints on density and smoothness priors. These help prevent overfitting and produce more realistic reconstructions.

  • Rendering

    The process of generating an image from a model by means of computer programs. Rendering is used in various applications, including video games, simulations, and movies, to visualize 3D models in a realistic or stylized manner.

  • Rendering Equation

    A mathematical model describing how light is emitted, absorbed, and scattered within a scene. It serves as the foundation for physically-based rendering methods, including those used in radiance fields. Solving the rendering equation enables realistic image synthesis.
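
    In Kajiya's standard form, the outgoing radiance at a surface point x in direction omega_o is the emitted radiance plus the incoming radiance integrated against the BRDF f_r over the hemisphere:

    ```latex
    L_o(x, \omega_o) = L_e(x, \omega_o)
      + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i
    ```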

  • Scene Optimization

    The process of adjusting the parameters of a radiance field model to minimize the error between predicted and actual images. This iterative process ensures that the model accurately learns the geometry and appearance of the scene.

  • Scene Representation

    The method used to encode the geometry, texture, and light properties of a 3D scene. Representations can be explicit (e.g., meshes, point clouds) or implicit (e.g., neural networks like NeRF). Implicit representations are compact and continuous, making them ideal for novel view synthesis.

  • Scene Representation Networks (SRNs)

    Neural networks designed to learn implicit representations of 3D scenes from images, enabling tasks like novel view synthesis and 3D reconstruction without explicit 3D supervision.

  • Shape-from-Shading

    A method that infers the 3D shape of a surface from the shading information in a single image. By analyzing the variations in brightness caused by lighting, this technique estimates the surface normals and reconstructs the object’s geometry, assuming known reflectance properties and lighting conditions.

  • Signed Distance Function (SDF)

    A function that returns the shortest distance from any point in space to the surface of a shape, with the sign indicating whether the point is inside or outside the shape. SDFs are used in 3D modeling and computer graphics for efficient surface representation.
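
    A minimal example: the SDF of a sphere, negative inside and positive outside (the sign convention varies between systems):

    ```python
    import numpy as np

    def sphere_sdf(p, center=np.zeros(3), radius=1.0):
        """Signed distance from point(s) p to the surface of a sphere."""
        return np.linalg.norm(np.asarray(p) - center, axis=-1) - radius

    print(sphere_sdf([0.0, 0.0, 0.0]))   # -1.0  (inside)
    print(sphere_sdf([2.0, 0.0, 0.0]))   #  1.0  (outside)
    ```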

  • Simultaneous Localization and Mapping (SLAM)

    A computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent’s location within it. SLAM is essential in robotics and augmented reality for navigation and environment interaction.

  • Sparse Neural Representations

    Neural network models that use sparse data structures to efficiently represent complex scenes or objects, reducing computational requirements while maintaining high-quality reconstructions.

  • Splatting

    A volume rendering technique that projects volumetric data onto the image plane by distributing energy (or color) from data points onto nearby pixels, often using Gaussian kernels. Splatting is used for efficient rendering of volumetric data without explicit surface extraction.

  • Structure from Motion (SfM)

    A photogrammetric technique that reconstructs 3D structures from 2D image sequences by analyzing the motion of feature points across images. SfM is widely used in 3D reconstruction and computer vision to create 3D models from photographs.

  • Structured Light Scanning

    A 3D scanning technique that projects a known pattern of light (such as stripes or grids) onto an object and captures the deformation of this pattern with cameras. By analyzing these deformations, the system reconstructs the object’s surface geometry with high accuracy, making it suitable for applications in industrial inspection, reverse engineering, and biometrics.

  • Surface Reconstruction

    The process of creating a continuous surface from discrete data points, such as those obtained from 3D scanning or photogrammetry. Surface reconstruction is vital for generating usable 3D models from raw data, enabling applications in visualization, analysis, and manufacturing.

  • Texture Mapping

    The method of applying images (textures) to 3D models to add color, detail, and realism. Texture mapping enhances the visual appearance of 3D reconstructions by providing surface details without increasing geometric complexity.

  • Time-of-Flight (ToF) Camera

    A depth-sensing device that measures the time it takes for emitted light (usually infrared) to travel to an object and back to the sensor. By calculating this time delay, the camera determines the distance to various points in the scene, producing a depth map useful in applications like gesture recognition, robotics, and 3D mapping.
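
    The basic range relation: the round-trip delay of the emitted light gives the distance, with c the speed of light (continuous-wave ToF cameras recover the delay from a phase shift, but the relation is the same):

    ```latex
    d = \frac{c\, \Delta t}{2}
    ```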

  • Training Data

    The set of images used to fit a radiance field, typically captured from multiple viewpoints along with their corresponding camera poses. High-quality, diverse training data ensures the model learns an accurate representation of the scene; noisy or insufficient data can lead to poor reconstructions.

  • Triangulation

    A geometric method used to determine the location of a point by measuring angles from known points at either end of a fixed baseline. In 3D scanning, triangulation is employed by systems like stereo cameras and structured light scanners to calculate depth information based on the convergence of lines of sight from different viewpoints.
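
    For a rectified stereo pair, triangulation reduces to the familiar relation between depth Z, focal length f, baseline B, and disparity d:

    ```latex
    Z = \frac{f\, B}{d}
    ```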

  • View Dependency

    The way an object’s appearance (e.g., its shading or reflections) changes with the observer’s viewing angle. NeRF explicitly models view dependency to capture effects like specular highlights, making rendered images more realistic. This is achieved by incorporating the viewing direction as an input to the neural network.

  • Visual Hull

    A geometric representation of an object obtained by intersecting the silhouettes captured from multiple camera viewpoints. The visual hull provides an approximate 3D shape of the object, which can be refined with additional data for more accurate reconstruction.

  • Volume / Volumetric Rendering

    A set of techniques used to display three-dimensional volumetric data, allowing for the visualization of structures without explicit surface representation. Volumetric rendering is essential in medical imaging and scientific visualization.

  • Voxel

    A voxel, short for “volume element,” is the three-dimensional equivalent of a pixel. It represents a value on a regular grid in 3D space and is commonly used in volumetric data representation, such as in medical imaging and 3D modeling.

  • Voxel Grid

    A volumetric representation of 3D space that divides the space into uniform cubes called voxels (volume elements). Each voxel contains information about the presence or absence of material at that location, enabling the modeling of complex structures and environments in applications like medical imaging and 3D modeling.
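
    A minimal sketch of an occupancy-style voxel grid: points are quantized by a fixed cell size and the corresponding voxels are marked occupied (the resolution and extent here are arbitrary):

    ```python
    import numpy as np

    def voxelize(points, grid_size=64, voxel_size=0.1, origin=np.zeros(3)):
        """Mark voxels containing at least one point in a boolean grid."""
        grid = np.zeros((grid_size,) * 3, dtype=bool)
        idx = np.floor((points - origin) / voxel_size).astype(int)
        idx = idx[np.all((idx >= 0) & (idx < grid_size), axis=1)]   # keep in-bounds points
        grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
        return grid

    pts = np.random.rand(1000, 3) * 6.4        # random points across a 6.4-unit extent
    print(voxelize(pts).sum(), "occupied voxels")
    ```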

  • Warping

    A process in image processing and computer graphics that involves deforming or mapping one image to align with another. Warping is used in applications such as texture mapping, image registration, and morphing, where precise alignment of images or textures is required for accurate representation or analysis.

  • Z-Buffering

    Z-buffering, also known as depth buffering, is a computer graphics technique that manages image depth coordinates in 3D graphics to handle occlusion. It keeps track of the depth of every pixel on the screen to determine which objects are in front and should be visible.
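
    A minimal sketch of the per-pixel depth test: a fragment overwrites a pixel only when it is closer than the depth already stored (smaller z means closer in this convention):

    ```python
    import numpy as np

    def draw_fragment(depth_buffer, color_buffer, x, y, z, color):
        """Write a fragment only if it is nearer than what the z-buffer holds."""
        if z < depth_buffer[y, x]:
            depth_buffer[y, x] = z
            color_buffer[y, x] = color

    depth = np.full((4, 4), np.inf)            # start with every pixel "infinitely far"
    colors = np.zeros((4, 4, 3))
    draw_fragment(depth, colors, 1, 1, 2.0, np.array([0.0, 1.0, 0.0]))  # green, z = 2
    draw_fragment(depth, colors, 1, 1, 5.0, np.array([1.0, 0.0, 0.0]))  # red is behind, ignored
    print(colors[1, 1])                        # stays green
    ```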

  • Zero-Shot Learning

    Zero-shot learning is a machine learning paradigm where a model is capable of recognizing objects or performing tasks without having seen any previous examples during training. In 3D reconstruction, this can involve generating models of objects from categories that were not present in the training data.