## Abstract

Minimal model analysis of glucose and insulin data from an IVGTT (intravenous glucose tolerance test) is widely used to estimate insulin sensitivity; however, the use of the model often requires intervention by a trained operator and some problems can occur in the estimation of model parameters. In the present study, a new method for minimal model analysis, termed GAMMOD, was developed based on genetic algorithms for the estimation of model parameters. Such an algorithm does not require the fixing of initial values for the parameters (that may lead to unreliable estimates). Our method also implements an automated weighting scheme not requiring manual intervention of the operator, thus improving the usability of the model. We studied a group of 170 women with a history of previous gestational diabetes. Results obtained by GAMMOD were compared with those obtained by MINMOD (a traditional gradient-based algorithm for minimal model analysis). Insulin sensitivity by GAMMOD was (3.86±0.19) compared with (4.33±0.20)×10^{−4} μ-units·ml^{−1}·min^{−1} by MINMOD; glucose effectiveness was 0.0236±0.0005 compared with 0.0229±0.0005 min^{−1} respectively. The difference in the estimation by the two methods was within the precision expected for such metabolic parameters and is probably of no clinical relevance. Moreover, both the coefficient of variation of the estimated parameters and the error of fit were generally lower in GAMMOD, despite the fact that it does not require manual intervention. In conclusion, the GAMMOD approach for parameter estimation in the minimal model provides a reliable estimation of the model parameters and improves the usability of the model, thus facilitating its further use and application in a clinical context.

- evolutionary algorithm
- genetic algorithm
- glucose effectiveness
- insulin sensitivity
- mathematical modelling
- minimal model

## INTRODUCTION

The minimal model is used widely to analyse glucose and insulin data from IVGTTs (intravenous glucose tolerance tests) for estimating *S*_{I} (insulin sensitivity) in both clinical and epidemiological studies [1–4]. The major component of strategies for estimating model parameters use traditional non-linear least-squares algorithms; however, these algorithms might lead to estimates that are incorrect or at least characterized by wide uncertainty [4–6]. As assessed by Godsland and Walton [7], the choice of strategy for parameter estimation is a fundamental issue, because it may lead to a failure in convergence of the solution of the equations. Moreover, different initial values of the parameters to be estimated could produce different solutions [8].

The search of a robust method for parameter estimation in minimal model analysis led us to select GAs (genetic algorithms), which are theoretically and empirically proven to provide a robust search in a complex space [9–11]. GAs are random search algorithms for non-linear problems based on the rules of natural selection. They do not require the fixing of initial values of the model parameters to be estimated.

Another limitation of the minimal model is that it often requires the intervention of an experienced operator to produce reliable results [12]. A strategy to limit operator intervention would be of great value in improving the usability of the model.

In the present study, we have developed a new method to estimate the parameters of the minimal model in humans based on a GA approach, termed GAMMOD. Also, an automated strategy in the model application was developed to reduce manual intervention by the operator. Our results using GAMMOD were compared with those obtained by MINMOD, the classical traditional algorithm for minimal model analysis [13].

## MATERIALS AND METHODS

### Subjects and test

A group of 170 women with a history of previous gestational diabetes (age, 33.0±0.4 years; body mass index, 26.6±0.4 kg/m^{2}; fasting glucose, 90.4±1.3 mg/dl) was studied 4–6 months after delivery at Department of Internal Medicine III, University of Vienna, Vienna, Austria. All the participants gave written informed consent to the study, which was approved by the local Ethics Committee.

Gestational diabetes was diagnosed according to the criteria of the 4th Workshop Conference of Gestational Diabetes [14], via a 75 g OGTT (oral glucose tolerance test). Women with previous ketoacidosis and/or β-cell antibodies were excluded. No woman was still breast-feeding at the time of the study. Data from some of the subjects have been included in previous studies [15].

Lower and upper limits for fasting glucose were 66 and 211 mg/dl respectively. Thus all of the categories of glucose tolerance were represented in our population.

FSIGT (frequently sampled intravenous glucose tolerance test) was performed after an overnight (10–12 h) fast. Glucose (time, 0–0.5 min; 300 mg/kg of body weight) and then normal insulin (time, 20–25 min: 0.03 international unit/kg of body weight; Humulin R; Eli Lilly) were infused intravenously. Venous blood samples were taken at timed intervals: at fasting, and at 3, 4, 5, 6, 8, 10, 14, 19, 22, 27, 30, 35, 40, 50, 70, 100, 140 and 180 min after glucose infusion.

### Minimal model

The minimal model was introduced by Bergman et al. [1] more than two decades ago. A brief description of its two differential equations is described below. The first equation (eqn 1) describes the rate of change in glucose concentration following the glucose bolus, whereas the second (eqn 2) describes the rate of change of insulin action from a compartment remote from the plasma:
(1)
(2)
where *G*(t) (mg/dl) is the plasma glucose concentration at time *t*, *I*(t) (μ-unit/ml) is the plasma insulin concentration at time *t*, *G*_{b} and *I*_{b} are basal glucose and insulin values respectively, *X*(t) (min^{−1}) is insulin action at time *t*; *S*_{G} is the glucose effectiveness (min^{−1}), i.e. the glucose disappearance rate from plasma without any change in dynamic insulin concentration, *G*_{0} is the value of plasma glucose concentration extrapolated at time zero after glucose injection, and *p*_{2} is a parameter related to the dynamics of insulin action. *S*_{I} (μ-units·ml^{−1}·min^{−1}) is defined as the ability of insulin to enhance glucose disappearance and inhibit glucose production. The parameters to be estimated are given by the vector **P**=(*S*_{G}, *p*_{2}, *S*_{I}, *G*_{0})^{T}.

### Parameter estimation through the traditional algorithm: MINMOD

We estimated the unknown parameter vector **P** by using the minimal model computer program MINMOD [13], which resorts to a traditional weighted non-linear least-squares approach using the Levenberg–Marquardt algorithm, in particular, for the minimization procedure. MINMOD also uses a weighting scheme of the glucose data, where the weights are equal to the inverse of the variance of the measurement errors. A zero-weighting scheme was used for the early part of the glucose time course, where glucose concentration data may be unreliable, as some mixing effects may still be present. It is for the operator to define the time interval where the zero-weighting scheme should be applied for each subject's data. In the MINMOD algorithm, the choice of the initial value of the parameters to be estimated is needed, and again this task often requires the intervention of the operator.

### Parameter estimation through an automated approach based on GAs: GAMMOD

A weighting scheme not requiring operator intervention was implemented to be applied to the early part of the plasma glucose curve, where some mixing effects can be present. In particular, such a scheme was applied to the samples collected in the first 10 min of FSIGT. (i) The first three samples, also including the pre-injection sample, were automatically zero-weighted; and (ii) the subsequent samples were zero-weighted according to the following conditions: (a) if the sample *G*(t_{i}) was lower than *G*(t_{i+1}), then *G*(t_{i}) was zero-weighted, and (b) if [*G*(t_{i})−*G*(t_{i+1})] was >3·[(*G*(t_{i+1})−*G*(t_{i+2})], then *G*(t_{i}) was zero-weighted.

Condition (b) was based on the hypothesis that plasma glucose should decline rapidly and smoothly in the first few minutes after glucose injection. Thus samples that, in this period, remain higher or, in general, much different from the neighbouring samples can be considered unreliable; therefore they were not weighted in the analysis. However, the number 3 in condition (b) was not based on theoretical reasons, but only on our experience with FSIGT analysis.

No weighting scheme was adopted for glucose samples not matching conditions (a) and (b), and on samples after the 10-min interval.

GAs were then applied for the minimization procedure needed to estimate the unknown parameter vector **P**. GAs, first introduced by Holland in 1975 [16], are the best-known class of evolutionary algorithms. The evolutionary techniques mimic the principles of natural selection of living entities. They combine elements of direct and stochastic searches, and exhibit some advantages over other search methods; in particular, the need for a smaller amount of *a priori* knowledge and, especially, fewer assumptions about the characteristics of the search space. In fact, they do not require the fixing of initial values for the estimates, and they usually converge to the global minimum of the error function [17]. Over the last few years, the level of interest in the application of GAs to engineering problems has grown considerably [11]. Moreover, they have been shown to outperform alternative search techniques for difficult problems involving discontinuous, noisy, high-dimensional and multi-modal objective functions [18].

In the present study, each possible solution of the minimization procedure was a ‘chromosome’, which was composed of four ‘genes’, e.g. the four components of the **P** vector. The limits of the parameters to be estimated, defining the borders for the field of existence of the solution, were set according to our experience as follows: *S*_{G} from 0.001 to 0.200 min^{−1}; *p*_{2} from 0 to 1 min^{−1}; *S*_{I} from 0.01 to 15×10^{−4} μ-units·ml^{−1}·min^{−1}; and *G*_{0} from 100 to 600 mg/dl.

After overcoming the problem of the possible dependence of the initial values of the parameters [8], the GA was started with 150 randomly generated chromosomes and all of them were evaluated. In order to do this, an appropriate cost function (the so-called ‘fitness function’) was built up. We considered the standard sum of squares function:
(3)
where *G*_{m}(t_{i}) is the glucose experimental sample at time t_{i}, and *G*_{e}(t_{i}) is the corresponding model estimation obtained by numerical solution and integration of the eqn (1) and (2); *N* is the number of samples measured.

The estimation in each subject's data of the unknown parameter vector **P** through the minimization of function ψ was considered the objective of the GA. However, as GAs seek to maximize the ‘fitness function’, to accomplish the transformation of the minimization problem into a maximization problem the error can simply be subtracted from a large positive constant, as suggested by Karr et al. [19]. Thus the goal was to maximize the function *ff* defined as *ff*=*D*−ψ, where *D* is a large positive constant parameter.

For each chromosome (i.e. possible solution), the ‘fitness function’ was evaluated such that those providing high fitness function values were selected. After the evaluation step, two genetic operators were applied for creating new chromosomes from those selected before: (i) the ‘cross-over’ operator randomly picked up two parents, **P1** and **P2**, and performed a type of interpolation between them by generating a random number Ω [Ω ∈ (0, 1)] and creating two children, **C1** and **C2**, as follows:
(4)
(ii) the ‘non-uniform mutation’ operator randomly picked up one parent and changed one of its genes (i.e. one of the model parameters) on the basis of a non-uniform probability distribution. This operator searched the space of the solution uniformly at the beginning and locally as the current generation approached the maximum number of generations.

The new population was formed by the children (derived from the application of the genetic operators) and from some high-performance individuals of the old population which were re-introduced by a proper selection function, thus avoiding problems of stagnation [18–20]. The previous evaluation step was repeated on the newly created population, and all of the procedure was iterated. To avoid convergence to a local minimum and to find the minimum number of iterations required for proper convergence (i.e. the number of generations necessary to the desired evolution), a sensitivity study was carried out, as described by Morbiducci et al. [8]. A total of 10000 generations was sufficient to obtain an accurate estimation of **P** (results not shown).

The numerical implementation of the complete method was performed with a proper own code using Matlab® (The MathWorks).

### Data analysis

Statistical comparisons between the model parameters estimated by MINMOD and GAMMOD on each individual were performed by applying Deming regression analysis [21].

Results are means±S.E.M., unless otherwise indicated. Precision of estimates is expressed as the CV (coefficient of variation; i.e., fractional S.D.), calculated from the variance values in the main diagonal of the inverse of the Fisher's information matrix.

## RESULTS

Mean glucose and insulin data used in the present study and GAMMOD-derived curve fits are shown in Figure 1. Table 1 summarizes the values of the four model parameters *S*_{G}, *p*_{2}, *S*_{I} and *G*_{0} estimated using the GAMMOD model. The corresponding CVs are also shown. In only one case was an extremely high CV for *p*_{2} and *S*_{I} found (1398 and 138% respectively). In the same case, however, MINMOD performed even worse (CV for *S*_{I} was 373%).

To compare the two parameter-estimation methods, the two most relevant parameters were used, i.e. *S*_{I} and *S*_{G} (Table 2). GAMMOD-estimated *S*_{I} values were slightly lower than the corresponding MINMOD values, and similar results were found for *S*_{G}. However, GAMMOD provided CVs lower than that estimated by MINMOD; in particular, the CV for *S*_{I} was almost half with GAMMOD compared with MINMOD.

To investigate the relationship between the estimation of *S*_{I} and *S*_{G} by the two methods, Deming regression analysis was applied. High *R* values were found for both *S*_{I} and *S*_{G} (*R*=0.94 and *R*=0.86 respectively; Figures 2 and 3) between the two methods. Deming regression analysis also showed that the slope of the regression line for *S*_{I} was virtually coincident with the unit line {slope, 0.971 [95% CI (confidence interval), 0.913–1.029]}; therefore there was no proportional difference in *S*_{I} estimates by the two methods. However, as the intercept of the regression line was different from zero [intercept, −0.345 (95% CI, −0.563 to −0.127)], the two methods differ slightly in *S*_{I} estimates by a constant amount. With regard to *S*_{G}, Deming regression showed that there was neither a proportional [slope, 1.041 (95% CI, 0.903–1.179)] nor constant [intercept, −0.0003 (95% CI, −0.0034 to 0.0028)] difference between the two methods. The Bland–Altman test provided similar findings for both *S*_{I} and *S*_{G} (results not shown).

In terms of goodness of fit, GAMMOD performed better than MINMOD: the sum of square error averaged over all of the study subjects was 608±37 (mg/dl)^{2} in MINMOD but only 290±9 (mg/dl)^{2} in GAMMOD.

## DISCUSSION

The minimal model is the reference method to estimate *S*_{I} and *S*_{G} from IVGTT data; however, the reliability and accuracy of the parameter estimation may strongly depend upon the numerical approach used. Different strategies may yield extremely different parameter estimates and serious numerical problems can occur, such as failure to converge on a solution or emergence of excessively high CVs for the estimated values. A traditional algorithm often used in parameter estimation is the Levenberg–Marquardt algorithm, which is the most popular gradient-based method converging quadratically to the local minimum [22]. This algorithm has been used in the MINMOD tool for minimal model parameter estimation [1,13].

Apart from this traditional approach, other strategies have been tested subsequently. In particular, Bayesian techniques have been shown to offer particular advantages in the estimation of *S*_{I} with the minimal model [23–26], thus providing improvements in the model estimation success rate and in the parameters precision. Pillonetto et al. [5] developed a Bayesian parameter estimation strategy for the minimal model based on the Markov chain Monte Carlo approach. The Bayesian estimation technique is based on the incorporation of *a priori* knowledge into the estimation process and, as noted by Godsland [27], it could be particularly efficient with the minimal model, as knowledge about *S*_{I} is notable. However, the Bayesian approach may have some draw-backs, such as the requirement for remodelling an entire population dataset in the presence of any *a posteriori* changes to an individual in the population [24,27].

A possible alternative is thus represented by GAs. In the present study, we have implemented a new approach for the estimation of minimal model parameters based on GAs, with 150 chromosomes in the initial population and 10000 generations. The modest number of parameters to be estimated, together with the specific genetic operators applied (i.e. ‘cross-over’ and ‘mutation’ operators), made the choice adopted consistent for the population size [10].

In addition to the described numerical difficulties in parameter estimation, another problem exists that limits the usability of minimal models in a clinical setting: in several cases, the intervention of an operator specifically trained in the use of the model is necessary to obtain reliable results [12]. In particular, the choice of the weighting scheme for the first glucose samples of the IVGTT may be critical. Based on our experience with minimal model analysis, we identified an automated weighting scheme procedure that performed well in the majority of cases, thus avoiding manual intervention of the operator. Moreover, any improper choices in the initial parameter values may lead, in some cases, to unreliable solutions. GAs automatically solved this problem, because they do not require fixing of the initial values for the parameters, but only their limits of existence. Our approach based on automated weighting scheme and GAs for numerical parameter estimation was termed GAMMOD. To the best of our knowledge, no other study has applied GAs to minimal model analysis.

Results obtained by GAMMOD were compared with those obtained by the traditional MINMOD approach. The two methods provided very similar estimations for both *S*_{I} and *S*_{G}. The difference in the estimations by the two methods was within the precision expected for such metabolic parameters. Moreover, such differences are probably of no clinical relevance. With regards to *S*_{I} (the parameter of major interest), it is worth noting that the differences between the two methods remained almost the same at any interval of physiological relevance. In fact, the value of 2.8×10^{−4} μ-units·ml^{−1}·min^{−1} can be assumed to be a threshold between impaired and normal *S*_{I} [15], and in the interval 0–2.8×10^{−4} μ-units·ml^{−1}·min^{−1} for *S*_{I} from MINMOD the absolute difference was 12.2±1.9% (55 subjects); in the interval 2.8–6.9×10^{−4} μ-units·ml^{−1}·min^{−1}, where 6.9×10^{−4} μ-units·ml^{−1}·min^{−1} can be assumed to be the threshold above which a subject is certainly not insulin-resistant [28], the absolute difference was 11.9±1.6% (90 subjects); for values >6.9×10^{−4} μ-units·ml^{−1}·min^{−1}, the absolute difference was 12.7±2.8% (25 subjects). Therefore no clinically relevant differences were observed in the complete *S*_{I} range. It must also be noted that some indicators suggest that GAMMOD performs even better than MINMOD, despite the fact that it does not require manual intervention. In fact, both the CVs of the estimated parameters and the error of fit are generally lower. Improved goodness of fit has already been observed in GAs applied to modelling glucose metabolism, i.e. a model of insulin secretion and kinetics based on an OGTT [8]. The only drawback of GAMMOD compared with MINMOD is the higher computational cost; however, as no real-time results are required, this is not a serious limitation.

In conclusion, we have developed a new approach for parameter estimation in the minimal model based on an automated weighting scheme and GAs. Our approach provided a reliable estimation of the model parameters and improved the usability of the model, thus facilitating its further diffusion and application in a clinical setting.

## Acknowledgments

The IVGTT data were obtained by co-operation between Institute of Biomedical Engineering, National Research Council, Padua, Italy and the Department of Internal Medicine III, Division of Endocrinology and Metabolism, University of Vienna, Vienna, Austria. The study was partially supported by a grant from Regione Veneto (DGR 2702/10-09-04), and by the Austrian Nationalbank Jubiläumsfonds (grant 11198). U.M. thanks Professor Enrico Primo Tomasini for his support of the research.

**Abbreviations:**
CI, confidence interval;
CV, coefficient of variation;
FSIGT, frequently sampled intravenous glucose tolerance test;
GA, genetic algorithm;
IVGTT, intravenous glucose tolerance test;
OGTT, oral glucose tolerance test;
SG, glucose effectiveness;
SI, insulin sensitivity

- The Biochemical Society