EVALUATION OF THE U-NET ARCHITECTURE FOR SEMANTIC SEGMENTATION OF NATURAL LAND COVER IN RGB SATELLITE IMAGERY

Jonathan Enrique Ruiz Apablaza; Fredy Andrés Cristancho Aguirre; Víctor Andrés Martínez Ruiz

doi:10.23854/07199562.2025613.ruiz

Authors

Jonathan Enrique Ruiz Apablaza, Instituto Geográfico Militar.
Fredy Andrés Cristancho Aguirre, Instituto Geográfico Agustín Codazzi.
Víctor Andrés Martínez Ruiz, Instituto Geográfico Agustín Codazzi.

Keywords:

semantic segmentation, U-Net, automated cartography, satellite imagery, geospatial artificial intelligence

Abstract

This study evaluates the feasibility of applying convolutional neural networks, specifically the U-Net architecture, to the semantic segmentation of natural land cover in RGB satellite images from the DeepGlobe dataset. The research is part of the binational COMIXTA project between the Instituto Geográfico Militar of Chile (IGM) and the Instituto Geográfico Agustín Codazzi of Colombia (IGAC), which aims to strengthen cartographic methodologies based on artificial intelligence. Two versions of the model were trained: one without an explicit validation set, and one using a simple hold-out validation strategy with an 80/20 split of the data and early stopping. The results show that the model trained without validation overfit, reaching artificially high metrics (IoU above 0.83), whereas the model trained with validation produced more conservative but generalizable predictions (IoU of 0.42). Qualitative evaluation revealed systematic errors in the "water" class, attributed to class imbalance in the dataset. Mixed-precision training, robust normalization, and GELU activation were used to improve the efficiency and stability of learning. The implementation ran on an accessible computing environment (NVIDIA T1000 GPU), demonstrating that these methodologies can be replicated in public institutions with limited resources. This work establishes a solid technical foundation for future extensions toward multiclass models, the integration of multispectral imagery, and large-scale automated cartographic production.
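
The evaluation ingredients named in the abstract (an 80/20 hold-out split, the IoU metric, and early stopping) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the random seed, and the patience value of 3 are assumptions chosen for illustration, and the abstract does not state which settings were actually used.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and return (train, val) in an 80/20 split.

    The seed is an illustrative assumption, used only for repeatability.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def iou(pred, target, cls):
    """Intersection over Union for one class over flat label sequences."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # IoU is undefined when the class appears in neither sequence.
    return inter / union if union else float("nan")


def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs (hypothetical default).
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1


if __name__ == "__main__":
    train, val = split_80_20(range(10))
    print(len(train), len(val))
    print(iou([1, 1, 0, 0], [1, 0, 0, 0], cls=1))
    print(early_stopping_epoch([1.0, 0.8, 0.9, 0.85, 0.95, 0.7]))
```

Computing IoU only on the held-out 20% is what separates the two reported figures: a score measured on the data the model was trained on can look high (above 0.83) while the held-out score (0.42) reflects performance on unseen tiles.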

