Volume 6 - Year 2025 - Pages 73-82
DOI: 10.11159/jmids.2025.007
Unified Metaheuristic Channel Selection Framework for Medical Image Segmentation
Alireza Norouziazad1,2, Fatemeh Esmaeildoost1,3, Razieh Salahandish 1,2*
1Laboratory of Advanced Biotechnologies for Health Assessments (Lab-HA), Lassonde
School of Engineering, York University, Toronto, ON, M3J 1P3 Canada
2Department of Electrical Engineering and Computer Science (EECS), Lassonde School of Engineering, York University, Toronto, ON, M3J 1P3, Canada
3Department of Mechanical Engineering, Lassonde School of Engineering, York University,
Toronto, ON, M3J 1P3, Canada
norouzi@yorku.ca; nadiet@yorku.ca
*Corresponding author: raziehs@yorku.ca
Abstract - Accurate segmentation of disease biomarkers, such as hippocampal atrophy in Alzheimer’s disease (AD) and tumor regions in breast ultrasound, is essential for early diagnosis and treatment planning. To address feature redundancy and poor interpretability in deep segmentation models, we propose a dynamic, training-time, fuzzy-adaptive channel selection framework guided by moth-flame (MFO) and whale (WOA) optimization algorithms. Our method enhances multi-scale feature representation by fitness-driven selection of discriminative channels without modifying network architecture or requiring retraining. Evaluated on clinically curated AD MRI and breast ultrasound datasets, our approach consistently improves segmentation performance: on hippocampal segmentation, it achieves 99.7% sensitivity and 85.4% Dice score, surpassing baselines by 0.6% and 5.3%, respectively. The fitness-guided mechanism also provides interpretable feature selection, aligning with clinical demands for trustworthy AI.
Keywords: Medical image segmentation, Alzheimer’s disease, breast cancer, metaheuristic optimization, fuzzy logic, channel selection, U-Net, DeepLabV3+, interpretability.
© Copyright 2025 Authors - This is an Open Access article published under the Creative Commons Attribution License terms (http://creativecommons.org/licenses/by/3.0).
Date Received: 2025-01-23
Date Revised: 2025-09-29
Date Accepted: 2025-10-09
Date Published: 2025-12-09
1. Introduction
In Alzheimer’s disease (AD), hippocampal atrophy is a critical early biomarker: even subtle underestimation of its volume can lead to missed diagnosis, inaccurate prognosis, or flawed assessment of therapeutic response (1-4). Similarly, in breast cancer, precise segmentation of tumor boundaries in ultrasound is essential for surgical planning, margin evaluation, and monitoring treatment efficacy, yet remains highly challenging due to speckle noise, irregular shapes, and low contrast (5).
Medical image segmentation thus serves as a cornerstone in modern healthcare, enabling early diagnosis, treatment planning, and longitudinal monitoring (1, 2). While MRI and other modalities offer high soft-tissue contrast, automated segmentation remains difficult due to lesion heterogeneity, imaging artifacts, and domain variability across scanners and institutions (5).
Deep learning, particularly CNNs like U-Net, V-Net, and DeepLabV3+, has significantly advanced segmentation by learning complex feature hierarchies, reducing reliance on manual annotation and accelerating clinical workflows (6, 7). However, key barriers persist: redundant feature channels dilute discriminative signals, domain shift limits generalizability, and black-box behavior undermines clinical trust. Although attention mechanisms, explainable AI (XAI), and domain adaptation offer partial remedies, they often fail to dynamically suppress irrelevant features or provide transparent, interpretable decisions (8).
To address these gaps, we propose a unified, fuzzy logic–guided metaheuristic framework for dynamic channel selection and feature refinement. Our approach integrates moth-flame optimization (MFO) (9) and whale optimization algorithm (WOA) (10) with both U-Net and DeepLabV3+, enabling adaptive emphasis on discriminative channels while suppressing redundant ones. Fuzzy inference further stabilizes the search process, balancing exploration and exploitation during training (11). Crucially, unlike static or attention-based methods, our framework adapts channel importance in real time across diverse models and datasets, enhancing both accuracy and interpretability, without architectural changes or retraining.
The key contributions of this paper are summarized as follows:
- We propose a dynamic, training-time, fitness-driven, fuzzy-adaptive channel selection framework using MFO and WOA for medical image segmentation. It operates within fixed architectures without retraining or structural modification, enabling interpretable, lightweight enhancement of U-Net and DeepLabV3+.
- We demonstrate robustness and generalizability by evaluating the framework on clinically curated datasets of AD and breast cancer imaging.
- The optimization process dynamically emphasizes task-relevant features while reducing redundancy, improving accuracy without compromising computational efficiency.
- Extensive experiments confirm consistent performance gains across all architecture–optimizer–dataset combinations, achieving higher Dice scores and sensitivity compared to baseline methods.
2. Related Work
2. 1. Deep Learning Architectures in Medical Image Segmentation
Medical image segmentation has experienced transformative advancement with the emergence of deep learning architectures, particularly CNNs. U-Net (12) has established itself as a cornerstone architecture for medical image segmentation tasks, demonstrating exceptional performance across various modalities, including CT, MRI, and X-ray imaging. The architecture's unique encoder-decoder structure with skip connections enables effective feature extraction while preserving spatial information crucial for precise boundary delineation (13, 14).
DeepLabV3+ (15), another prominent architecture, has gained significant attention for its ability to capture multi-scale contextual information through atrous spatial pyramid pooling and encoder-decoder mechanisms. Recent comparative studies have shown that DeepLabV3+ with ResNet101 backbone achieves superior performance in several evaluation metrics compared to traditional U-Net variants, particularly excelling in capturing detailed boundary information. The architecture's synthesis of DeepLab series advantages with the encoder-decoder structure makes it particularly suitable for complex medical segmentation tasks (14, 16-21).
Beyond traditional CNN architectures, transformer-based models have emerged as powerful alternatives. TransUNet (22), which combines vision transformer (ViT) with U-Net architecture, has demonstrated superior performance across multiple medical image segmentation datasets. The integration of self-attention mechanisms enables these models to capture long-range dependencies and global context, addressing limitations of purely convolutional approaches (23, 24).
2. 2. Attention Mechanisms and Channel Selection
The incorporation of attention mechanisms has proven instrumental in enhancing segmentation performance by enabling models to focus on discriminative features while suppressing irrelevant information. Comprehensive attention-based networks, such as CA-Net (25), have demonstrated the effectiveness of simultaneously utilizing spatial, channel, and scale attention mechanisms (26), achieving significant improvements in Dice scores across various medical segmentation tasks. The spatial attention modules help networks focus on foreground regions, while channel attention mechanisms adaptively recalibrate feature responses to highlight task-relevant channels (27-29).
Recent developments in channel attention have focused on frequency domain representations and multi-scale feature enhancement. FCAU-Net extends channel attention to the frequency domain (30), combining multi-spectral attention modules with U-Net for improved feature representation. Similarly, EMCAD introduces efficient multi-scale convolutional attention decoding that achieves state-of-the-art performance with significant parameter reduction while maintaining superior accuracy (31, 32).
The integration of dual attention mechanisms, combining both spatial and channel attention, has shown particular promise in medical imaging applications. DA-TransUNet successfully integrates spatial and channel dual attention with transformer architectures, demonstrating consistent improvements across five medical image segmentation datasets (33). These approaches address the challenge of emphasizing task-relevant features while managing the computational complexity inherent in medical image analysis (34, 35).
2. 3. Metaheuristic Optimization in Medical Image Analysis
Metaheuristic optimization algorithms have emerged as powerful tools for feature selection and parameter optimization in medical image analysis. MFO has demonstrated particular effectiveness in medical applications, with binary variants achieving superior performance in feature selection tasks across multiple medical datasets. Recent studies have shown that enhanced MFO algorithms with Levy flight operators and chaotic mapping can significantly improve classification accuracy while reducing feature dimensionality (36-38).
The WOA has been successfully integrated with deep learning frameworks for medical image classification and segmentation. WOA-optimized CNN architectures have achieved remarkable results in breast cancer detection. The algorithm's ability to optimize CNN parameters, including kernel size, feature map count, and pooling type, has proven particularly valuable for medical imaging applications where parameter sensitivity is critical (39-41).
Hybrid metaheuristic approaches combining multiple optimization strategies have shown enhanced performance in medical feature selection. Chaotic MFO variants have demonstrated superior performance in AD detection, outperforming ant colony optimization and firefly algorithms in terms of feature reduction and classification accuracy (37, 38, 42).
2. 4. Fuzzy Logic Integration in Medical Segmentation
Fuzzy logic approaches have gained prominence in medical image analysis due to their ability to handle inherent uncertainties and ambiguities present in medical imagery. Recent developments in fuzzy-based medical image processing encompass diagnostic decision support, image enhancement, segmentation, and feature extraction, with fuzzy C-means clustering and fuzzy decision trees emerging as particularly effective techniques (20, 43, 44).
The integration of fuzzy logic with deep learning architectures has shown promising results in addressing boundary uncertainty and feature ambiguity. F2CAU-Net, a dual fuzzy cascade segmentation network (44, 45), incorporates fuzzy convolutional modules to enhance uncertain feature representation and fuzzy attention mechanisms to suppress redundancy in multi-resolution feature fusion. This approach specifically targets the challenges of uncertainty, ambiguity, and boundary fuzziness inherent in medical images (46).
Fuzzy attention mechanisms have been developed to improve feature selection and reduce redundancy in skip connections of encoder-decoder architectures. These mechanisms employ fuzzy functions to suppress irrelevant information while emphasizing clinically relevant regions, thereby improving segmentation performance in complex medical scenarios. The mathematical framework provided by fuzzy logic enables more nuanced handling of imprecise and vague data compared to traditional binary approaches (47).
2. 5. Comparison with Neural Architecture Search (NAS) and Pruning Methods
While NAS and network pruning have emerged as powerful paradigms for optimizing deep learning models in computer vision, their application in medical image segmentation remains limited due to computational complexity and a lack of dynamic adaptability (48). NAS frameworks, such as Auto-DeepLab and DARTS, perform architecture-level optimization by searching over vast design spaces, often requiring thousands of GPU hours and large-scale labeled datasets, which are rarely available in clinical settings. Similarly, pruning-based approaches, including magnitude-based weight pruning and structured channel pruning, permanently remove parameters or channels based on static importance scores computed at a single training stage. These methods lack the ability to adaptively respond to evolving feature relevance during training and may inadvertently discard clinically meaningful pathways (49, 50).
In contrast, our fuzzy logic-guided metaheuristic framework performs dynamic, fitness-driven channel selection without modifying the underlying network architecture or requiring retraining from scratch. Channel importance is continuously re-evaluated at each training iteration based on activation magnitude and population diversity, enabling context-aware refinement tailored to the segmentation task. This approach preserves model stability while reducing redundancy, a critical advantage in medical applications where interpretability, reproducibility, and deployment efficiency are paramount.
3. Methodology
3. 1. Overview
This study proposes a generalized framework for adaptive feature selection in deep medical image segmentation networks. The framework is applied to both DeepLabV3+ and U-Net:
DeepLabV3+: the dynamic feature selection operates within the ASPP module, which captures multi-scale context.
U-Net: the optimization is applied to the encoder convolutional layers, responsible for hierarchical feature extraction.
In both architectures, fuzzy logic-guided metaheuristic optimization (MFO or WOA) is used to identify the most discriminative channels at each layer, suppress redundant channels, and enhance both fine-grained boundary delineation and global contextual representation. This strategy allows the same mechanism to generalize across datasets, tumor types, and backbone architectures.
3. 2. Feature Selection with Fuzzy-MFO/WOA
3. 2. 1. Channel Fitness
The importance of each channel is quantified using a fitness function that computes the mean absolute activation across its spatial dimensions, averaged over the current training batch. Let $Z_i \in \mathbb{R}^{H \times W}$ denote the activation map of the $i$-th channel, where $H$ and $W$ are the height and width of the feature map, respectively. The fitness of channel $i$ is defined as:
$$F(C_i) = \frac{1}{B\,H\,W} \sum_{b=1}^{B} \sum_{j=1}^{H} \sum_{k=1}^{W} \left| Z_i^{(b)}(j,k) \right|,$$
where $B$ is the batch size and $Z_i^{(b)}(j,k)$ is the activation at spatial location $(j,k)$ in channel $i$ for the $b$-th sample in the batch.
To ensure fair comparison across layers with different activation scales, we optionally normalize fitness values per layer by dividing by the maximum activation magnitude observed in that layer during the current epoch. Channels with higher fitness values are assumed to contain more task-relevant information (e.g., tumor boundaries or structural textures). The optimal channel is then selected as:
$$C^{*} = \arg\max_{i \in \{1, \dots, C\}} F(C_i),$$
where $F(C_i)$ is the fitness of channel $i$ and $C$ is the total number of channels. This ensures that the most informative channels are prioritized for subsequent refinement.
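The fitness computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the `(B, C, H, W)` tensor layout, and the synthetic activations are assumptions made for the example.

```python
import numpy as np

def channel_fitness(activations):
    """Mean absolute activation per channel.

    activations: array of shape (B, C, H, W) (layout assumed for this sketch).
    Returns an array of shape (C,).
    """
    activations = np.asarray(activations, dtype=float)
    # Average |activation| over the batch axis and both spatial axes.
    return np.abs(activations).mean(axis=(0, 2, 3))

def normalize_fitness(fitness, max_abs_activation, eps=1e-12):
    """Optional per-layer normalization by the largest activation magnitude."""
    return np.asarray(fitness, dtype=float) / (max_abs_activation + eps)

# Synthetic activations: channel 3 is deliberately scaled up so it dominates.
rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8, 16, 16))
acts[:, 3] *= 5.0

fitness = channel_fitness(acts)
normalized = normalize_fitness(fitness, np.abs(acts).max())
best_channel = int(np.argmax(fitness))  # index of the highest-fitness channel
```

In this sketch the maximum magnitude is taken over a single batch; the text describes tracking it over the current epoch, which would require a running maximum.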
3. 2. 2. Population Diversity
To maintain robust exploration and avoid premature convergence, we define population diversity as the standard deviation of channel fitness values:
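$$\mathrm{Div} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( F(C_i) - \mu_F \right)^2},$$
where $F(C_i)$ is the fitness of channel $i$, $n$ is the number of channels, and $\mu_F$ is the mean fitness across all channels.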
A high diversity indicates that channels vary significantly in importance, signaling that exploration should continue; low diversity suggests that most channels are similar, so exploitation can be emphasized.
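As a quick numerical illustration of this diversity measure, the sketch below computes the standard deviation of a fitness vector. The population form (dividing by n) is assumed here, and the fitness values are made up for the example.

```python
import numpy as np

def population_diversity(fitness):
    """Standard deviation of channel fitness values (population form, 1/n)."""
    fitness = np.asarray(fitness, dtype=float)
    return float(np.sqrt(np.mean((fitness - fitness.mean()) ** 2)))

# Identical fitness values: no diversity, so exploitation would be emphasized.
uniform_div = population_diversity([0.5, 0.5, 0.5, 0.5])

# Fitness values far from their mean: high diversity, so exploration continues.
spread_div = population_diversity([0.1, 0.9, 0.1, 0.9])
```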
3. 2. 3. Fuzzy Adaptation
We employ a fuzzy inference system (FIS) to dynamically adjust the metaheuristic's convergence parameter $a$, balancing exploration and exploitation:
$$a = \frac{\sum_{k} w_k \, a_k}{\sum_{k} w_k},$$
where $a_k$ is the output from fuzzy rule $k$ and $w_k$ is its weight.
Typical rules:
- IF Iteration progress is low AND diversity is high, THEN decay 𝑎 slowly → favors exploration.
- IF Iteration progress is high AND diversity is low, THEN decay 𝑎 rapidly → favors exploitation.
This adaptive control allows the algorithm to focus on different feature selection strategies depending on the current stage of training.
To ensure reproducibility and precise control, the fuzzy system is formally defined as follows: The fuzzy inference system uses two input variables: Iteration Progress (normalized epoch count ∈ [0,1]) and Population Diversity (normalized standard deviation ∈ [0,1]). Each input is mapped to two linguistic terms (“Low”, “High”) using S-shaped (for High) and Z-shaped (for Low) membership functions. The output variable, Decay Rate (a), is mapped to three terms: “Slow” (0.1–0.4), “Medium” (0.4–0.7), “Fast” (0.7–1.0). Defuzzification is performed using the centroid method. Four rules govern the system (e.g., “IF Progress=Low AND Diversity=High → Decay=Slow”). All variables are min-max normalized per epoch.
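A simplified sketch of such a two-input fuzzy controller is given below. Several choices are assumptions made for illustration and are not stated in the text: the S- and Z-shaped functions span the full [0, 1] range, AND is implemented as `min`, the two rules not quoted above are assumed to map to "Medium", and each output term is represented by the midpoint of its range and combined by a weighted average rather than a full centroid over output membership functions.

```python
import numpy as np

def smf(x, a=0.0, b=1.0):
    """S-shaped membership: 0 at x <= a, rising smoothly to 1 at x >= b."""
    u = np.clip((x - a) / (b - a), 0.0, 1.0)
    return float(np.where(u <= 0.5, 2 * u**2, 1 - 2 * (1 - u) ** 2))

def zmf(x, a=0.0, b=1.0):
    """Z-shaped membership: mirror image of the S-shaped function."""
    return 1.0 - smf(x, a, b)

# Representative output values: midpoints of the Slow/Medium/Fast ranges.
SLOW, MEDIUM, FAST = 0.25, 0.55, 0.85

def decay_rate(progress, diversity):
    """Map (iteration progress, population diversity) in [0, 1] to a decay rate."""
    p_low, p_high = zmf(progress), smf(progress)
    d_low, d_high = zmf(diversity), smf(diversity)
    rules = [
        (min(p_low, d_high), SLOW),     # early and diverse: explore
        (min(p_high, d_low), FAST),     # late and converged: exploit
        (min(p_low, d_low), MEDIUM),    # assumed rule (not given in the text)
        (min(p_high, d_high), MEDIUM),  # assumed rule (not given in the text)
    ]
    weights = np.array([w for w, _ in rules])
    outputs = np.array([a for _, a in rules])
    return float((weights * outputs).sum() / (weights.sum() + 1e-12))

early = decay_rate(0.0, 1.0)   # only the "Slow" rule fires
middle = decay_rate(0.5, 0.5)  # all four rules fire equally
late = decay_rate(1.0, 0.0)    # only the "Fast" rule fires
```

The weighted average keeps the example short; a centroid defuzzifier over explicit output membership functions, as described in the text, would give slightly different values while preserving the same ordering.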
3. 3. MFO
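MFO is a bio-inspired metaheuristic in which moths represent candidate channels and flames represent the best channels found so far. The position of each moth is updated along a spiral towards its flame:
$$M_i = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j,$$
where $F_j$ is the flame (best channel) position, $D_i = |F_j - M_i|$ is the distance to the flame, $b = 0.5$ controls the spiral's shape, and $t \in [-1, 1]$ introduces randomness. The number of flames at iteration $\tau$ adapts as:
$$N_{\text{flame}}(\tau) = \mathrm{round}\left( N - \tau \cdot \frac{N - 1}{T} \right),$$
where $N$ is the maximum number of flames and $T$ is the maximum number of iterations, balancing exploration and exploitation during channel selection.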
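The spiral update and the shrinking flame count can be illustrated as follows. This sketch follows the standard MFO formulation (Mirjalili, 2015) with b = 0.5 as stated above; the function names, the continuous position encoding, and the example values are assumptions, and the mapping from positions to channel subsets is not shown.

```python
import numpy as np

def flame_count(iteration, max_iter, n_flames):
    """Number of flames kept at a given iteration (linearly decreasing to 1)."""
    return int(round(n_flames - iteration * (n_flames - 1) / max_iter))

def spiral_update(moth, flame, b=0.5, rng=None):
    """Move a moth along a logarithmic spiral around its flame."""
    rng = np.random.default_rng() if rng is None else rng
    t = rng.uniform(-1.0, 1.0, size=np.shape(moth))  # random spiral parameter
    dist = np.abs(flame - moth)                      # distance to the flame
    return dist * np.exp(b * t) * np.cos(2 * np.pi * t) + flame

rng = np.random.default_rng(0)
flame = np.array([0.8, 0.2, 0.6])
moth = np.array([0.1, 0.9, 0.5])
new_moth = spiral_update(moth, flame, rng=rng)

start_flames = flame_count(0, max_iter=100, n_flames=10)
end_flames = flame_count(100, max_iter=100, n_flames=10)
```

Because the cosine factor is bounded by 1 and t lies in [-1, 1], the updated position never moves further from the flame than the original distance scaled by e^b.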
3. 4. WOA
WOA is a nature-inspired metaheuristic algorithm modeled after the hunting behavior of humpback whales, particularly their bubble-net feeding strategy. In the context of feature selection for deep neural networks, WOA provides a robust method for dynamically identifying the most discriminative channels while suppressing redundant ones. The algorithm operates through three complementary mechanisms, each contributing to a balance between exploration and exploitation.
Encircling prey: This mechanism guides candidate solutions (whales) toward the best-known solution (the optimal channel) at the current iteration. It is mathematically described as:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right|,$$
$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D},$$
where $\vec{X}^{*}(t)$ represents the position of the current best channel set, $\vec{X}(t)$ is the position of a candidate channel set at iteration $t$, $\vec{A}$ and $\vec{C}$ are coefficient vectors that control the step size and direction of the update, and $\vec{D}$ is the distance between the whale and the best solution.
Bubble-net spiral attack: The bubble-net mechanism models the spiral movement of whales around their prey. This allows fine-grained refinement of the candidate solutions, promoting local exploitation of the feature space:
$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t),$$
where $b$ is a constant controlling the spiral shape, $l$ is a random number in $[-1, 1]$ that introduces stochasticity, and $\vec{D}' = \left| \vec{X}^{*}(t) - \vec{X}(t) \right|$ represents the distance vector modified for spiral movement.
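Exploration Phase: To prevent premature convergence and ensure that a diverse set of channels is considered, WOA incorporates a random search mechanism:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}_{\text{rand}} - \vec{X}(t) \right|,$$
$$\vec{X}(t+1) = \vec{X}_{\text{rand}} - \vec{A} \cdot \vec{D}.$$
Here, $\vec{X}_{\text{rand}}$ is the position of a randomly selected channel set.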
These mechanisms collectively balance exploration and exploitation, ensuring robust channel selection while preventing overfitting or redundancy.
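The three WOA mechanisms can be combined into a single update step, sketched below under the conventions of the original WOA paper (Mirjalili and Lewis, 2016): A = 2a·r − a and C = 2·r with r uniform in [0, 1], and a 50/50 choice between the spiral and the encircling/exploration branch. The function name, the default b = 1, and the all-components test on |A| are assumptions; here the convergence parameter `a` is passed in, as it would be when supplied by the fuzzy controller.

```python
import numpy as np

def woa_step(x, best, population, a, b=1.0, rng=None):
    """One WOA position update for a single whale `x` (1-D array)."""
    rng = np.random.default_rng() if rng is None else rng
    A = 2 * a * rng.random(x.shape) - a   # step-size coefficients in [-a, a]
    C = 2 * rng.random(x.shape)           # direction coefficients in [0, 2]
    if rng.random() < 0.5:
        if np.all(np.abs(A) < 1):
            # Encircling prey: contract toward the best-known solution.
            D = np.abs(C * best - x)
            return best - A * D
        # Exploration: move relative to a randomly chosen whale.
        x_rand = population[rng.integers(len(population))]
        D = np.abs(C * x_rand - x)
        return x_rand - A * D
    # Bubble-net spiral attack around the best-known solution.
    l = rng.uniform(-1.0, 1.0)
    D_prime = np.abs(best - x)
    return D_prime * np.exp(b * l) * np.cos(2 * np.pi * l) + best

rng = np.random.default_rng(0)
population = rng.random((6, 5))  # 6 whales, 5-dimensional positions
best = population[0].copy()
moved = woa_step(population[1], best, population, a=1.5, rng=rng)
# A whale already at the best position with a = 0 stays where it is.
stationary = woa_step(best.copy(), best, population, a=0.0, rng=rng)
```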
3. 5. Feature Refinement and Aggregation
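Selected channels are refined using 3×3 convolutional layers:
$$\hat{Z}_c = \sigma\left( \mathrm{Conv}_{3 \times 3}(Z_c) \right),$$
where $Z_c$ is the feature map of selected channel $c$ and $\sigma$ is the ReLU activation function.
Outputs from all relevant branches (DeepLabV3+: ASPP branches, 1×1, pooled context; U-Net: encoder layers) are concatenated:
$$Z_{\text{cat}} = \mathrm{Concat}\left( \hat{Z}^{(1)}, \hat{Z}^{(2)}, \dots, \hat{Z}^{(m)} \right),$$
where $\hat{Z}^{(r)}$ is the refined output of branch $r$ and $m$ is the number of branches.
The concatenated features are compressed using a 1×1 convolution:
$$Z_{\text{out}} = \mathrm{Conv}_{1 \times 1}(Z_{\text{cat}}).$$
This process ensures efficient integration of selected channels while maintaining spatial resolution.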
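The refine, concatenate, and compress pipeline can be illustrated with plain NumPy. This is only a shape-level sketch: the helper names, the channel counts, and the random or identity weights are assumptions, and a real implementation would use a deep learning framework with learned kernels.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv3x3_relu(x, w):
    """3x3 convolution (zero padding, stride 1) followed by ReLU.

    x: (C_in, H, W); w: (C_out, C_in, 3, 3). Spatial size is preserved.
    """
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    windows = sliding_window_view(xp, (3, 3), axis=(1, 2))  # (C_in, H, W, 3, 3)
    return np.maximum(np.einsum("chwij,ocij->ohw", windows, w), 0.0)

def conv1x1(x, w):
    """1x1 convolution: a per-pixel linear mix of channels. w: (C_out, C_in)."""
    return np.einsum("chw,oc->ohw", x, w)

rng = np.random.default_rng(0)
branches = [rng.normal(size=(4, 8, 8)) for _ in range(3)]  # three 4-channel branches

# Identity kernels (centre tap only) make the 3x3 step easy to check by hand.
identity = np.zeros((4, 4, 3, 3))
identity[np.arange(4), np.arange(4), 1, 1] = 1.0

refined = [conv3x3_relu(b, identity) for b in branches]
stacked = np.concatenate(refined, axis=0)             # concatenate along channels
fused = conv1x1(stacked, rng.normal(size=(6, 12)))    # compress 12 -> 6 channels
```

With identity kernels the 3×3 step reduces to a ReLU of its input, which makes it straightforward to confirm that padding preserves the 8×8 spatial resolution through every stage.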
3. 6. Objective Function
Training optimizes a combination of segmentation loss and channel fitness reward:
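$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{seg}} - \lambda \sum_{c \in P} F(C_c),$$
where $\mathcal{L}_{\text{seg}}$ is the segmentation loss, $P$ is the set of selected channels, $\lambda$ balances the terms, and $F(C_c)$ is the fitness of channel $c$. Maximizing the fitness reward $F(P) = \sum_{c \in P} F(C_c)$ favors the selection of channels with high activations and discriminative power.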
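A small numerical sketch of such a combined objective follows. The text does not specify the segmentation loss or the value of λ, so a soft Dice loss and λ = 0.1 are used here purely as assumptions, and the fitness values are made up.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - soft Dice overlap between predicted probabilities and a binary mask."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)
    return 1.0 - (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

def total_loss(pred, target, fitness, selected, lam=0.1):
    """Segmentation loss minus a weighted reward for the selected channels."""
    reward = float(np.sum(np.asarray(fitness, dtype=float)[list(selected)]))
    return soft_dice_loss(pred, target) - lam * reward

target = np.array([[1, 1], [0, 0]])
fitness = [0.2, 0.5, 0.9]  # illustrative per-channel fitness values

# With a perfect prediction the Dice term vanishes and only the reward remains.
loss_high = total_loss(target, target, fitness, selected=[1, 2])
loss_low = total_loss(target, target, fitness, selected=[0, 1])
```

Selecting the higher-fitness channels gives the lower total loss, which is the behavior the reward term is meant to encourage.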
3. 7. Implementation Efficiency
Channel selection occurs only during training, minimizing computational overhead during inference. The fusion of fuzzy logic with MFO/WOA improves convergence and stabilizes selection of informative features. The framework generalizes across architectures, allowing consistent performance improvements in both DeepLabV3+ and U-Net.
4. Results and Discussion
4. 1. Datasets and Preprocessing
The first dataset comprises 135 T1-weighted MRI scans from clinically diagnosed AD patients, annotated by experienced neuroradiologists. Of these, 100 scans were used for training and validation and 35 for testing [22]. Patient inclusion criteria required an MMSE score ≤ 24 and hippocampal atrophy confirmed by volumetric assessment. Preprocessing included z-score normalization, intensity scaling to [0,1], and volume resampling to 1 mm³ isotropic resolution, followed by slicing into 256×224 axial slices. To simulate MRI artifacts and improve robustness, Gaussian noise (σ=0.05) was injected. The dataset was divided into training (80 subjects, 1,920 slices), validation (20 subjects, 480 slices), and test (35 subjects, 840 slices) subsets.
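The intensity-related preprocessing steps can be sketched as below. This is an illustrative outline only: resampling to isotropic resolution is omitted, the order of operations and the absence of clipping after noise injection are assumptions, and the input slice is synthetic.

```python
import numpy as np

def preprocess_slice(img, noise_sigma=0.05, rng=None):
    """Z-score normalize, rescale to [0, 1], then add Gaussian noise.

    Returns (scaled, noisy) so the clean and noise-injected versions can be compared.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(img, dtype=np.float64)
    z = (img - img.mean()) / (img.std() + 1e-8)            # z-score normalization
    scaled = (z - z.min()) / (z.max() - z.min() + 1e-8)    # min-max scaling to [0, 1]
    noisy = scaled + rng.normal(0.0, noise_sigma, size=scaled.shape)
    return scaled, noisy

rng = np.random.default_rng(0)
raw = rng.integers(0, 4096, size=(256, 224))  # synthetic 12-bit slice
scaled, noisy = preprocess_slice(raw, rng=rng)
```

Note that the noisy output can fall slightly outside [0, 1]; whether to clip it afterwards is a design choice the text does not specify.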
The second dataset is a breast ultrasound collection, widely used for classification, detection, and segmentation tasks in oncology imaging. It contains images from 600 female patients aged 25–75 years, collected in 2018. The dataset comprises 780 ultrasound images (resolution: 500×500 pixels, PNG format) along with ground truth annotations. Each image is categorized into one of three classes: normal, benign, or malignant. The availability of pixel-level masks makes this dataset well-suited for segmentation, enabling accurate tumor localization.
Both datasets highlight the challenges of medical imaging segmentation: MRI hippocampal boundaries often suffer from low contrast and tissue inhomogeneity, whereas ultrasound introduces speckle noise and variable image quality. Our framework is designed to handle both scenarios by adaptively suppressing noisy channels while retaining discriminative features.
4. 2. Implementation Details
Training was performed on two NVIDIA Tesla 4 GPUs. Models were optimized with the Adam optimizer using an initial learning rate of 0.001 and a cosine decay scheduler with a learning rate decay factor of 0.98. All models were trained for 100 epochs. Owing to limited GPU memory, the batch size was set to 16.
4. 3. Ablation Studies
To validate the contribution of each component, we compared the baseline DeepLabV3+, a variant with MFO but without fuzzy logic, and the proposed approach with Fuzzy-MFO and an enhanced ASPP module. The baseline model achieved a Dice score of 80.1% and an IoU of 73.11%. Adding MFO without fuzzy logic improved these metrics to a Dice score of 83.4% (an improvement of 3.3%) and an IoU of 75.1%. The proposed method improved further, reaching a Dice score of 85.4% (an improvement of 5.3% over the baseline) and an IoU of 77.8%. Notably, adding fuzzy logic raised the Dice score and IoU by 2% and 2.6%, respectively, over vanilla MFO by mitigating overfitting to noisy channels (p < 0.01, paired t-test) (Figure 1).
4. 4. Comparison with State-of-the-Art
To validate the effectiveness of our proposed fuzzy-optimization framework, we evaluate its performance on two medical imaging datasets: AD MRI and breast ultrasound. We compare our optimized variants, U-Net + MFO, U-Net + WOA, DeepLabV3+ + MFO, and DeepLabV3+ + WOA, not only against their respective non-optimized baselines (U-Net and DeepLabV3+) but also against state-of-the-art attention-based segmentation models, namely Attention U-Net and DA-TransUNet. Performance is assessed using Dice similarity coefficient (DSC), mean intersection-over-union (mIoU), sensitivity, and false discovery rate (FDR). Quantitative results are summarized in Table 1.
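For reference, the four reported metrics can be computed from binary masks as in the sketch below. It uses the standard confusion-matrix definitions for a single foreground class; the function name and the toy masks are illustrative, and averaging over classes (for mIoU) or over images is not shown.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Dice, IoU, sensitivity, and false discovery rate for binary masks.

    Assumes at least one foreground pixel in both masks (no zero-division guard).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.sum(pred & target)    # foreground predicted as foreground
    fp = np.sum(pred & ~target)   # background predicted as foreground
    fn = np.sum(~pred & target)   # foreground that was missed
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "fdr": fp / (tp + fp),
    }

# Toy example: every true pixel is found, plus one false positive.
target = np.array([1, 1, 1, 0, 0, 0])
pred = np.array([1, 1, 1, 1, 0, 0])
metrics = segmentation_metrics(pred, target)
```

The toy case shows how sensitivity can be perfect while FDR stays well above zero: over-segmentation costs nothing in recall but is penalized by Dice, IoU, and FDR.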
On the AD MRI dataset, our proposed method (DeepLabV3+ + Fuzzy-MFO) demonstrated consistent superiority across all evaluation metrics. It achieved the highest Dice score of 85.4%, outperforming Attention U-Net (81.22%) by 4.18% and DA-TransUNet (82.90%) by 2.5%, in addition to surpassing baseline U-Net by 6.8% and baseline DeepLabV3+ by 5.3%. For region-overlap accuracy, it reached an mIoU of 77.8%, representing a substantial gain over DA-TransUNet (75.60%) and Attention U-Net (74.31%). Sensitivity was highest at 99.7%, indicating exceptional recall of hippocampal voxels, critical for avoiding underestimation in clinical volumetric assessments. The false discovery rate decreased to 21.2%, compared to 27.0% in the DeepLabV3+ baseline and 24.5–25.8% in attention-based models, demonstrating more precise segmentation with fewer false positives. These gains confirm that our framework enhances both global structure preservation and fine boundary delineation beyond what even advanced attention mechanisms can achieve.
On the breast ultrasound dataset, characterized by speckle noise, irregular tumor boundaries, and heterogeneous textures, our U-Net + WOA variant achieved the highest Dice score of 78.80%, outperforming DA-TransUNet (78.10%) and Attention U-Net (76.87%) by 0.7% and 1.93%, respectively. It also surpassed baseline U-Net (75.75% Dice, 62.18% mIoU) by +3.05% and +4.09%, respectively. Our DeepLabV3+ + WOA variant closely followed with 78.68% Dice and 66.98% mIoU, surpassing its baseline (76.08% Dice, 63.48% mIoU) by +2.6% and +3.5%. Sensitivity for both top variants was at least 98.5%, ensuring robust tumor detection, while FDR dropped to 23.0–24.0%, indicating improved localization precision over DA-TransUNet (24.8%) and Attention U-Net (26.3%). The consistent advantage over these strong attention-based baselines highlights a key strength of our approach: dynamic, fitness-driven channel selection via metaheuristic optimization provides a more adaptive and context-aware mechanism for suppressing redundancy and enhancing discriminative features than static or transformer-based attention alone.
Table 1. Comparison of segmentation results.
| Model | Dice (%) | mIoU (%) | Sensitivity (%) | FDR (%) |
|---|---|---|---|---|
| AD MRI | | | | |
| U-Net | 78.60 | 71.84 | 98.7 | 29.5 |
| Attention U-Net | 81.22 | 74.31 | 99.0 | 25.8 |
| DA-TransUNet | 82.90 | 75.60 | 99.2 | 24.5 |
| DeepLabV3+ | 80.10 | 73.11 | 99.1 | 27.0 |
| U-Net/MFO | 81.41 | 75.02 | 99.3 | 24.8 |
| U-Net/WOA | 82.21 | 75.54 | 99.4 | 23.5 |
| DeepLabV3+/MFO | 85.40 | 77.80 | 99.7 | 21.2 |
| DeepLabV3+/WOA | 84.50 | 76.40 | 99.6 | 21.5 |
| Breast Ultrasound | | | | |
| U-Net | 75.75 | 62.18 | 98.4 | 28.9 |
| Attention U-Net | 76.87 | 63.91 | 98.1 | 26.3 |
| DA-TransUNet | 78.10 | 65.70 | 98.6 | 24.8 |
| DeepLabV3+ | 76.08 | 63.48 | 99.0 | 26.1 |
| U-Net/MFO | 76.58 | 64.09 | 98.3 | 25.4 |
| U-Net/WOA | 78.80 | 66.27 | 98.5 | 24.0 |
| DeepLabV3+/MFO | 77.41 | 65.19 | 98.7 | 23.5 |
| DeepLabV3+/WOA | 78.68 | 66.98 | 98.8 | 23.0 |
Statistical analysis reinforced these findings. Paired t-tests (n=20 runs) indicated significant improvements (p < 0.01) for all major metrics in our optimized models compared to their respective baselines, confirming the robustness of the observed gains. Bland–Altman analysis (Figure 2: plot comparing Dice scores of the baseline DeepLabV3+ vs. the proposed methods across all test samples) revealed a mean bias of +5.05% with narrow limits of agreement (4.69 to 5.41), demonstrating consistent, sample-independent improvement. Collectively, these qualitative and statistical evaluations illustrate that our approach not only enhances numerical metrics but also produces clinically meaningful and visually reliable segmentations.
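The Bland–Altman quantities mentioned above (mean bias and limits of agreement) can be computed as in the following sketch. It uses the conventional bias ± 1.96 standard deviations of the paired differences; the Dice values are synthetic placeholders, not the study's data.

```python
import numpy as np

def bland_altman(baseline, proposed):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    baseline = np.asarray(baseline, dtype=float)
    proposed = np.asarray(proposed, dtype=float)
    diff = proposed - baseline
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)

# Synthetic per-sample Dice scores (percent), for illustration only.
baseline_dice = [80.0, 79.0, 81.0, 80.0]
proposed_dice = [85.0, 84.5, 86.0, 85.5]
bias, lower, upper = bland_altman(baseline_dice, proposed_dice)
```

Narrow limits around a positive bias indicate that the improvement is similar from sample to sample rather than driven by a few outliers.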
To evaluate the computational efficiency of our framework, we compare key metrics, including inference latency, memory consumption, and model size, in Table 2.
Table 2. Computational efficiency comparison of the proposed methods and baseline models for a 256×224 input size. All models use ResNet-50 as a backbone.
| Model | Param (M) | FLOPs (GF) | Latency (ms) | FPS | Model Size (MB) |
|---|---|---|---|---|---|
| U-Net | 32.5 | 9.37 | 8.55 | 116.9 | 124.4 |
| DeepLabV3+ | 26.8 | 18.54 | 9.05 | 110.5 | 107.09 |
| U-Net/MFO | ~62.0 | ~23.04 | ~21.8 | ~82.6 | ~245 |
| U-Net/WOA | ~62.0 | ~23.04 | ~23.5 | ~81.1 | ~245 |
| DeepLabV3+/MFO | ~56.3 | ~32.21 | ~23.0 | ~76.9 | ~225.6 |
| DeepLabV3+/WOA | ~56.3 | ~32.21 | ~25.5 | ~74.1 | ~225.6 |
5. Conclusion
In this work, we presented a framework for dynamic, training-time, fitness-driven, fuzzy-adaptive channel selection using MFO and WOA. Crucially, our method enhances segmentation performance without modifying network architecture or requiring retraining, enabling transparent, plug-and-play integration with existing backbones such as U-Net and DeepLabV3+. By dynamically emphasizing discriminative channels and suppressing redundancy, guided by activation-based fitness and stabilized through fuzzy inference, the framework achieves consistent gains in accuracy and interpretability across diverse medical imaging tasks.
Extensive experiments on two clinically relevant datasets, AD MRI (hippocampal segmentation) and breast ultrasound (tumor segmentation), demonstrate the robustness and generalizability of our framework. Our best-performing variant, DeepLabV3+ + Fuzzy-MFO, achieved a Dice score of 85.4% and sensitivity of 99.7% on hippocampal segmentation, surpassing the baseline by 5.3% and 0.6%, respectively. On breast ultrasound images, DeepLabV3+ + WOA achieved 78.68% Dice, reflecting strong adaptability to noisy and heterogeneous imaging conditions. Ablation studies confirmed that the fuzzy adaptation mechanism contributes an additional 2.0% Dice improvement over vanilla MFO, highlighting its critical role in stabilizing channel selection.
Nevertheless, this study has limitations. First, our current implementation operates on 2D axial slices, which may neglect 3D anatomical context essential for volumetric consistency, particularly in longitudinal hippocampal analysis. Second, while we evaluated on two clinically relevant datasets, validation remains limited to single-center cohorts; multi-scanner and multi-institutional data are needed to assess robustness against domain shift. Third, the metaheuristic optimization introduces additional computational overhead during training, though inference remains efficient and unchanged.
Future work will address these gaps through three key directions: (1) extending the framework to full 3D volumetric segmentation to capture spatial coherence; (2) integrating it with transformer-based architectures (e.g., SwinUNETR) to combine global attention with dynamic channel selection; and (3) deploying it in federated learning settings to enable privacy-preserving model enhancement across hospitals without sharing sensitive patient data. These steps will further bridge the gap between algorithmic innovation and real-world clinical deployment.
References
[1] Wu J, Xu M, editors. One-prompt to segment all medical images. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2024.
[2] Ma J, He Y, Li F, Han L, You C, Wang B. Segment anything in medical images. Nature Communications. 2024;15(1):654.
[3] Yousefi-Banaem H, Malekzadeh S. Hippocampus segmentation in magnetic resonance images of Alzheimer's patients using Deep machine learning. arXiv preprint arXiv:2106.06743. 2021.
[4] Ferreira D, Verhagen C, Hernández-Cabrera JA, Cavallin L, Guo C-J, Ekman U, et al. Distinct subtypes of Alzheimer's disease based on patterns of brain atrophy: longitudinal trajectories and clinical applications. Scientific Reports. 2017;7(1):46263.
[5] Houssein EH, Mohamed GM, Djenouri Y, Wazery YM, Ibrahim IA. Nature inspired optimization algorithms for medical image segmentation: a comprehensive review. Cluster Computing. 2024;27(10):14745-66.
[6] Yuan Y, Cheng Y. Medical image segmentation with UNet-based multi-scale context fusion. Scientific Reports. 2024;14(1):15687.
[7] Park JS, Fadnavis S, Garyfallidis E. Multi-scale V-net architecture with deep feature CRF layers for brain extraction. Communications Medicine. 2024;4(1):29.
[8] Guan H, Yap P-T, Bozoki A, Liu M. Federated learning for medical image analysis: A survey. Pattern Recognition. 2024;151:110424.
[9] Mirjalili S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-Based Systems. 2015;89:228-49.
[10] Mirjalili S, Lewis A. The whale optimization algorithm. Advances in Engineering Software. 2016;95:51-67.
[11] Xu H, Liu W-d, Li L, Yao D-j, Ma L. FSRW: Fuzzy logic-based whale optimization algorithm for trust-aware routing in IoT-based healthcare. Scientific Reports. 2024;14(1):16640.
[12] Ronneberger O, Fischer P, Brox T, editors. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention; 2015: Springer.
[13] Cao L, Zhang Q, Fan C, Cao Y. Not Another Dual Attention UNet Transformer (NNDA-UNETR). Quantitative Imaging in Medicine and Surgery. 2024;14(12):9169.
[14] Boreiri Z, Azad AN, Ghodousian A, editors. A convolutional neuro-fuzzy network using fuzzy image segmentation for acute leukemia classification. 2022 27th International Computer Conference, Computer Society of Iran (CSICC); 2022: IEEE.
[15] Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H, editors. Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of ECCV; 2018.
[16] Dhiyanesh B, Vijayalakshmi M, Saranya P, Viji D. EnsembleEdgeFusion: advancing semantic segmentation in microvascular decompression imaging. Scientific Reports. 2025;15(1):17892.
[17] Norouziazad A, Homam B, Feygin A, Najafpour Ghazvini Fardshad M, Rozenblat S, Matinpour A, et al. Optimized DeepLabV3+… Advanced Intelligent Systems. 2025:2500282.
[18] Norouziazad A, Esmaeildoost F, Homam B, Salahandish R, editors. Grey Wolf Optimizer Enhances Adaptive Atrous Spatial Pyramid Pooling… COMPSAC 2025: IEEE.
[19] Norouziazad A, Fardshad MNG, Salahandish R. Adaptive Student's T-Loss… (No DOI provided).
[20] Ghodousian A, Amiri H, Azad AN. On the resolution and Linear programming problems subjected by Aczel-Alsina Fuzzy relational equations. arXiv:2204.11273.
[21] Norouziazad A, Matinpour A, Deljoo F, Homam B, Trivedi B, Salahandish R. Breast Cancer Segmentation Using a Modified U-Net… WCEECSS 2025. View Article
[22] Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, et al. TransUNet: Transformers make strong encoders… arXiv:2102.04306. View Article
[23] Chen J, Mei J, Li X, Lu Y, Yu Q, Wei Q, et al. TransUNet… Medical Image Analysis. 2024;97:103280. View Article
[24] Xiao H, Li L, Liu Q, Zhu X, Zhang Q. Transformers in medical image segmentation: A review. Biomedical Signal Processing and Control. 2023;84:104791. View Article
[25] Gu R, Wang G, Song T, Huang R, Aertsen M, Deprest J, et al. CA-Net… IEEE Transactions on Medical Imaging. 2020;40(2):699-711. View Article
[26] Fang W, Han X-h, editors. Spatial and channel attention modulated network… ACCV 2020. View Article
[27] Zhang J, Chen X, Yang B, Guan Q, Chen Q, Chen J, et al. Advances in attention mechanisms… Computer Science Review. 2025;56:100721. View Article
[28] Zhao H, Zhang Y, Liu S, Shi J, Loy CC, Lin D, et al. Psanet… ECCV 2018. View Article
[29] Zhu Y, Han G, Zhu H, Zhang F. Feature Description Attention… Engineering Applications of Artificial Intelligence. 2025;161:112139. View Article
[30] Tao C, Chen H, Wu R, Zhi H, Yan X, Liu H, et al., editors. FCAU-Net… 2023 IEEE Smart World Congress. View Article
[31] Rahman MM, Munir M, Marculescu R, editors. Emcad… CVPR 2024. View Article
[32] Qiong L, Chaofan L, Jinnan T, Liping C, Jianxiang S. Medical image segmentation… Scientific Reports. 2025;15(1):2833. View Article
[33] Sun G, Pan Y, Kong W, Xu Z, Ma J, Racharak T, et al. DA-TransUNet… Frontiers in Bioengineering and Biotechnology. 2024;12:1398237. View Article
[34] Zhang M, Zhang Y, Liu S, Han Y, Cao H, Qiao B. Dual-attention transformer-based… Scientific Reports. 2024;14(1):25704. View Article
[35] Yuan Y, Zhang H, Xiong Z, Qin J, Xu D, editors. DAPFormer… ICGIP 2023; 2024: SPIE. View Article
[36] Sahoo SK, Saha AK, Ezugwu AE, Agushaka JO, Abuhaija B, Alsoud AR, et al. Moth flame optimization: theory… Archives of Computational Methods in Engineering. 2023;30(1):391-426. View Article
[37] Vasan SS, Jayalakshmi P, editors. Enhancing Alzheimer's Disease Detection with Chaotic Moth Flame Optimization… ICETCS 2024. View Article
[38] Abu Khurmaa R, Aljarah I, Sharieh A. An intelligent feature selection approach based on moth flame optimization… Neural Computing and Applications. 2021;33(12):7165-7204. View Article
[39] Fang H, Fan H, Lin S, Qing Z, Sheykhahmad FR. Automatic breast cancer detection… International Journal of Imaging Systems and Technology. 2021;31(1):425-38. View Article
[40] Ay Ş, Ekinci E, Garip Z. A comparative analysis of meta-heuristic optimization algorithms… Journal of Supercomputing. 2023;79(11):11797-826. View Article
[41] Dhakhinamoorthy C, Mani SK, Mathivanan SK, Mohan S, Jayagopal P, Mallik S, et al. Hybrid whale and gray wolf deep learning… Mathematics. 2023;11(5):1136. View Article
[42] Diaz P, Jiju MJE. A Hybrid Moth-Flame Optimization Technique… IJCVIP. 2022;12(1):1-20. View Article
[43] Baimukashev R, Kadyrov S, Turan C. Systematic Survey of Deep Fuzzy Computer Vision… FIE. 2024;16(3):220-43. View Article
[44] Ghodousian A, Azad AN, Amiri H. Log-sum-exp optimization problem… arXiv:2206.09716. View Article
[45] Zhou T, Wang H, Geng S, Ju H, Huang J, Fu F, et al. F2CAU-Net… Applied Soft Computing. 2025:113692. View Article
[46] Ticala C, Pintea CM, Chira M, Matei O. An Innovative Medical Image Analyzer… Medical Sciences. 2025;13(3):97. View Article
[47] Das N, Saha S, Nasipuri M, Basu S, Chakraborti T. Deep-Fuzz… PLOS ONE. 2023;18(6):e0286862. View Article
[48] Fuentes-Tomás J-A, Mezura-Montes E, Acosta-Mesa H-G, Márquez-Grajales A. Tree-based codification… IEEE TEVC. 2024;28(3):597-607. View Article
[49] Dinsdale NK, Jenkinson M, Namburete AI. STAMP… Medical Image Analysis. 2022;81:102583. View Article
[50] Bosma MM, Dushatskiy A, Grewal M, Alderliesten T, Bosman P, editors. Mixed-block neural architecture search… Medical Imaging 2022: Image Processing; 2022: SPIE. View Article