
Water supply networks are critical urban infrastructure, and accurate flow forecasting is essential for optimizing operational scheduling, reducing energy consumption, and maintaining system stability. Because large language models (LLMs) are strong at sequence modeling and representation learning, applying them to time-series forecasting has become an emerging research direction. A key challenge, however, is the modality gap between numerical time-series data and the semantic embedding space of language models. To address this gap, this paper proposes a time-series foundation model for forecasting based on cross-modal alignment. The method constructs a mapping between time-series features and the semantic embedding space, so that numerical sequences can be projected into the semantic domain. A cross-modal alignment mechanism is further designed to enhance feature fusion, improving the model’s ability to capture the multi-scale periodic patterns, long-term temporal trends, and stochastic demand fluctuations commonly observed in water distribution systems. Experimental results show that the proposed method consistently outperforms baseline approaches across different prediction horizons in terms of mean absolute error (MAE) and mean squared error (MSE), demonstrating the effectiveness and generalization capability of the cross-modal alignment strategy for flow forecasting in water distribution networks.
water supply network; flow forecasting; time-series forecasting; large language models; cross-modal alignment; time-series foundation model