FINS: Fast Image-to-Neural Surface

Wei-Teng Chu1 · Tianyi Zhang2 · Matthew Johnson-Roberson3,4 · Weiming Zhi3,4,5
1 Stanford University · 2 Aurora · 3 Carnegie Mellon University · 4 Vanderbilt University · 5 The University of Sydney

FINS reconstructs high-fidelity surfaces and signed distance fields from a single RGB image in seconds, enabling real-time robotic surface interaction.

Abstract

We present Fast Image-to-Neural Surface (FINS), a lightweight framework for reconstructing high-fidelity implicit surfaces and signed distance fields (SDFs) from a single image or sparse image set. Unlike prior neural implicit methods that require dense multi-view supervision and long optimization times, FINS converges within seconds by combining multi-resolution hash grid encoding, lightweight geometry and color heads, and approximate second-order optimization. By leveraging pre-trained 3D foundation models to lift 2D observations into 3D point clouds, FINS enables accurate and efficient SDF supervision from minimal visual input. We demonstrate superior convergence speed and reconstruction accuracy compared to state-of-the-art baselines, and validate its applicability in robotic surface following and motion planning tasks.

Motivation

Signed Distance Fields (SDFs) are widely used in robotics for collision avoidance, motion planning, and continuous surface interaction. However, existing neural implicit surface reconstruction methods typically:

- require dense multi-view supervision, and
- need minutes to hours of per-scene optimization,

which makes them impractical for real-time robotic use. FINS addresses these limitations by enabling real-time, single-image SDF reconstruction.

Method Overview

FINS pipeline: single image → 3D foundation model → point cloud supervision → hash-grid-encoded implicit SDF.
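The lifting step in this pipeline turns a 2D observation into 3D supervision points. A minimal sketch of the idea is standard depth back-projection, assuming a per-pixel depth map and camera intrinsics are available (FINS obtains its 3D points from a pre-trained 3D foundation model; `depth_to_pointcloud` and the intrinsics `K` below are illustrative, not the paper's API):

```python
import numpy as np

def depth_to_pointcloud(depth, K):
    """Back-project a depth map into a 3D point cloud via pinhole intrinsics K.

    depth : (H, W) array of per-pixel depths.
    K     : (3, 3) intrinsics with focal lengths K[0,0], K[1,1]
            and principal point (K[0,2], K[1,2]).
    Returns an (H*W, 3) array of camera-frame 3D points.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)
```

The resulting points serve as surface samples that supervise the implicit SDF near its zero level set.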

Key Components

- Multi-resolution hash grid encoding of 3D space for fast, compact feature lookup.
- Lightweight geometry (SDF) and color heads that decode the encoded features.
- Approximate second-order optimization, enabling convergence within seconds.
- A pre-trained 3D foundation model that lifts 2D observations into 3D point clouds for SDF supervision.

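The encoding component can be sketched in a few lines. The toy class below is a generic Instant-NGP-style multi-resolution hash grid with trilinear interpolation; all sizes, prime constants, and names are illustrative defaults, not FINS's actual configuration:

```python
import numpy as np

class HashGridEncoder:
    """Toy multi-resolution hash grid encoder (illustrative, not the paper's code)."""
    PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

    def __init__(self, n_levels=4, table_size=2**14, feat_dim=2,
                 base_res=16, growth=1.5, seed=0):
        rng = np.random.default_rng(seed)
        self.resolutions = [int(base_res * growth**l) for l in range(n_levels)]
        # One small learnable feature table per resolution level.
        self.tables = [rng.normal(0.0, 1e-4, (table_size, feat_dim))
                       for _ in range(n_levels)]
        self.table_size = table_size

    def _hash(self, idx):
        # Spatial hash of integer grid coordinates into the feature table.
        h = idx.astype(np.uint64) * self.PRIMES  # wraps mod 2^64 by design
        return (h[:, 0] ^ h[:, 1] ^ h[:, 2]) % np.uint64(self.table_size)

    def encode(self, pts):
        """pts: (N, 3) in [0, 1]^3 -> (N, n_levels * feat_dim) features."""
        feats = []
        for res, table in zip(self.resolutions, self.tables):
            x = pts * res
            x0 = np.floor(x).astype(np.int64)
            w = x - x0                      # trilinear interpolation weights
            acc = np.zeros((len(pts), table.shape[1]))
            for corner in range(8):         # blend the 8 surrounding cell corners
                offs = np.array([(corner >> d) & 1 for d in range(3)])
                cw = np.prod(np.where(offs, w, 1.0 - w), axis=1, keepdims=True)
                acc += cw * table[self._hash(x0 + offs)]
            feats.append(acc)
        return np.concatenate(feats, axis=1)
```

In FINS these concatenated features are fed to the lightweight geometry and color heads; here they are just returned.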
Results

Reconstruction Results
Reconstruction quality comparison. FINS achieves faster convergence and higher fidelity under identical sparse-view conditions.

FINS converges within ~10 seconds on consumer-grade GPUs and outperforms multi-view baselines in both surface accuracy and SDF quality.

Robotics Applications

Application to robot surface following and motion planning using the learned SDF.

The learned SDF representation enables:

- continuous-contact surface following,
- collision-aware motion planning, and
- fast distance and gradient queries for reactive control.

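The primitive underlying these applications is that an SDF gives both the distance to the surface and, through its gradient, the direction to it. The helper below is a generic sketch of closest-point projection onto the zero level set (not code from the paper), demonstrated on an analytic sphere SDF standing in for the learned one:

```python
import numpy as np

def project_to_surface(sdf, x, iters=5, eps=1e-4):
    """Project a point onto the zero level set of an SDF via gradient steps.

    sdf : callable mapping a 3-vector to its signed distance
          (here an analytic sphere; in FINS this would be the learned SDF).
    """
    x = np.asarray(x, dtype=float)
    for _ in range(iters):
        d = sdf(x)
        # Finite-difference gradient; for a true SDF it has unit norm.
        g = np.array([(sdf(x + eps * e) - sdf(x - eps * e)) / (2 * eps)
                      for e in np.eye(3)])
        x = x - d * g / (np.linalg.norm(g) + 1e-12)
    return x

# Example: project an off-surface point onto a unit sphere.
sphere = lambda p: np.linalg.norm(p) - 1.0
p = project_to_surface(sphere, [0.3, 0.4, 2.0])
```

Surface following repeats this projection along a path, while motion planners use the distance value directly as a collision margin.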
BibTeX

@misc{chu2025fins,
    title         = {Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation}, 
    author        = {Wei-Teng Chu and Tianyi Zhang and Matthew Johnson-Roberson and Weiming Zhi},
    year          = {2025},
    eprint        = {2509.20681},
    archivePrefix = {arXiv},
    primaryClass  = {cs.RO},
    url           = {https://arxiv.org/abs/2509.20681}, 
}