This dialog is activated by selecting the command Fit Wizard... from the Analysis Menu. This command is active if a plot or a table window is selected. In the latter case, this command first creates a new plot window using the list of selected columns in the table.
This dialog is used to fit discrete data points with a mathematical function. The fitting is done by minimizing the least square difference between the data points and the Y values of the function.
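To illustrate the principle of least-squares fitting, here is a minimal, self-contained Python sketch. QtiPlot's own solver is iterative and handles arbitrary models; a straight line y = a*x + b has a closed-form minimum, which is used here for brevity.

```python
# Minimal sketch of what a least-squares fit does: find the parameters
# that minimize the sum of squared differences between the data points
# and the Y values of the function. For y = a*x + b the minimum has a
# closed form (the normal equations), so no iteration is needed.

def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y_i - (a*x_i + b))**2)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # data lying exactly on y = 2*x + 1
a, b = fit_line(xs, ys)
print(a, b)                 # -> 2.0 1.0
```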
The top of the dialog box is used to select one of four categories of functions: 1) user defined functions which have been previously saved, 2) the classical functions provided by QtiPlot in the analysis menu, 3) simple elementary (basic) built-in functions, and 4) external functions provided via plugins.
To choose a function, first select a category and then the desired function from the displayed Function list. Clicking on the checkbox under the selector will clear the contents of the function entry text pane (see below) and copy the selected function into it. You can also click the Add expression button to copy the selected function, but this will not clear any previous contents. If you've selected one of the "basic" functions, there will be no checkbox and you will need to use the Add expression button.
The bottom half of the dialog box allows you to define your own function. You can either write your own mathematical expression from scratch or add expressions from the function selector with the Add expression button. Once a custom expression is completed, clicking on the Save button will add the function to the list of user defined functions. The Name field will be used as the name of the function. A copy of the function is saved on disk with the extension ".fit". You can define the folder where ".fit" files are saved using either the Choose models folder... button (shown only when "User defined" is selected) or by selecting a new folder in the Save file dialog. Functions can be removed from the User Defined list by selecting them and clicking on the Remove button. You will be asked to confirm the deletion.
The second step is to define the parameters for the fit: you must provide an initial guess for each fitting parameter.
Figure 5-87. The second step of the Fit Wizard... dialog box.
In this second tab you can also choose a weighting method for your fit (the default is No weighting). The available weighting methods are:
No weight: all weighting coefficients are set to 1 (wi = 1).
Instrumental: the values of the associated error bars are used as weighting coefficients, wi = 1/eri^2, where eri are the error bar sizes stored in error bar columns. You must add Y-error bars to the analyzed curve before performing the fit.
Statistical: the weighting coefficients are calculated as wi = 1/yi, where yi are the y values in the fitted data set.
Arbitrary Dataset: allows setting the weighting coefficients using an arbitrary data set, wi = 1/ci^2, where ci are the values in the arbitrary data set. The column used for the weighting must have a number of rows equal to the number of points in the fitted curve.
Direct Weighting: allows setting the weighting coefficients using an arbitrary data set wi = ci, where ci are the values in the arbitrary data set. The column used for the weighting must have a number of rows equal to the number of points in the fitted curve.
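The five weighting schemes above can be summarized in a short Python sketch. The function name and the argument names (errs standing in for the Y-error bar column, col for the arbitrary data set) are illustrative, not QtiPlot API.

```python
# Hypothetical sketch of the weighting methods listed above.
# `errs` plays the role of the Y-error bar column and `col` the
# arbitrary data set; names are illustrative only.

def weights(method, ys, errs=None, col=None):
    if method == "No weight":             # w_i = 1
        return [1.0 for _ in ys]
    if method == "Instrumental":          # w_i = 1/er_i^2
        return [1.0 / e ** 2 for e in errs]
    if method == "Statistical":           # w_i = 1/y_i
        return [1.0 / y for y in ys]
    if method == "Arbitrary Dataset":     # w_i = 1/c_i^2
        return [1.0 / c ** 2 for c in col]
    if method == "Direct Weighting":      # w_i = c_i
        return list(col)
    raise ValueError("unknown weighting method: " + method)

print(weights("Instrumental", [1.0, 4.0], errs=[0.5, 2.0]))  # -> [4.0, 0.25]
```

Note that Instrumental, Arbitrary Dataset and Direct Weighting all require a second column whose length matches the fitted curve, which is why the dialog enforces that constraint.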
After the fit, the log window is opened to show the results of the fitting process.
Depending on the settings in the Custom Output tab, a function curve (option Uniform X Function) or a new table (if you choose the option Same X as Fitting Data) will be created for each fit. The new table includes all the X and Y values used to compute and to plot the fitted function and is hidden by default. It can be found and viewed in the project explorer.
Figure 5-88. The third step of the Fit Wizard... dialog box.
The controls in the Parameters Output group box define the display options for the results of data fit operations. The Format list box allows you to choose a default numerical format. The Significant Digits option customizes the precision of the output; its exact meaning depends on the chosen numeric format. The following format and precision options are available:
- Decimal or scientific e-notation, whichever is the most concise. The value of the Significant Digits control represents the maximum number of significant figures in the output (trailing zeroes are omitted).
- Decimal notation. The value of the Significant Digits control represents the number of digits after the decimal point.
- Scientific e-notation, where the letter e represents "times ten raised to the power of" and is followed by the value of the exponent. The value of the Significant Digits control represents the number of digits after the decimal point.
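The three formats correspond directly to Python's 'g', 'f' and 'e' float format specifiers, which makes the shifting meaning of Significant Digits easy to see: 'g' counts significant figures (and drops trailing zeroes), while 'f' and 'e' count digits after the decimal point.

```python
# The same value rendered in the three formats, each with a
# "Significant Digits" setting of 4.
v = 1234.56789
print(format(v, ".4g"))   # -> 1235        (most concise, 4 significant figures)
print(format(v, ".4f"))   # -> 1234.5679   (decimal, 4 digits after the point)
print(format(v, ".4e"))   # -> 1.2346e+03  (scientific e-notation)
```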
This dialog tab also provides controls that can be used to evaluate the goodness of fit. The first is the Residuals Plot button which is used to display the curve of the plot residuals. The Conf. Bands and Pred. Bands buttons can be used to generate confidence and prediction limits for the current fit, based on the user input confidence value.
By default, reported errors are not automatically scaled by the square root of the reduced chi-squared value. You can choose to enable this option by checking the Scale errors with sqrt(Chi^2/doF) box.
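The scaling itself is simple: when the box is checked, each reported parameter error is multiplied by sqrt(Chi^2/doF), i.e. the square root of the reduced chi-square. A sketch with illustrative values:

```python
import math

# Sketch of the optional error scaling: each raw parameter error is
# multiplied by sqrt(RSS / doF). All numbers here are illustrative.
rss, dof = 2.5, 10
raw_errors = [0.10, 0.04]

scale = math.sqrt(rss / dof)                 # sqrt(0.25) = 0.5
scaled_errors = [e * scale for e in raw_errors]
print(scaled_errors)                         # -> [0.05, 0.02]
```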
After the fit, a series of fit statistics are displayed in the log window allowing evaluation of the goodness of fit. These values are:
The residual sum of squares (RSS) is the sum of the squared residuals. This statistic measures the total deviation of the response values from the fitted values. It is also called the Sum of Squares due to Error (SSE). A small RSS indicates a tight fit of the model to the data.
The reduced chi-square is obtained by dividing the residual sum of squares (RSS) by the degrees of freedom (doF), which is defined as the number of response values minus the number of fitted coefficients estimated from the response values. Although this is the quantity that is minimized during the iterative process, it is typically not a good measure of the goodness of fit. For example, if the y data is multiplied by a scaling factor, the reduced chi-square will be scaled as well (by the square of that factor).
R-square is defined as 1 - RSS/SST, where SST is the total sum of squares. This statistic measures how successful the fit is in explaining the variation of the data. Put another way, R-square is the square of the correlation between the response values and the predicted response values. It is also called the square of the multiple correlation coefficient and the coefficient of multiple determination.
R-square can take on any value between 0 and 1, with a value closer to 1 indicating that a greater proportion of variance is accounted for by the model. For example, an R-square value of 0.8234 means that the fit explains 82.34% of the total variation in the data about the average.
If you increase the number of fitted coefficients in your model, R-square will increase although the fit may not improve in a practical sense. To avoid this situation, you should use the degrees of freedom adjusted R-square statistic described below.
Note that it is possible to get a negative R-square for equations that do not contain a constant term. Because R-square is defined as the proportion of variance explained by the fit, if the fit is actually worse than just fitting a horizontal line then R-square is negative. In this case, R-square cannot be interpreted as the square of a correlation. Such situations indicate that a constant term should be added to the model.
The adjusted R-square statistic is generally the best indicator of the fit quality when you compare two models that are nested - that is, a series of models each of which adds additional coefficients to the previous model. The adjusted R-square statistic can take on any value less than or equal to 1, with a value closer to 1 indicating a better fit. Negative values can occur when the model contains terms that do not help to predict the response.
The root mean squared error (RMSE) is also known as the fit standard error and the standard error of the regression. It is an estimate of the standard deviation of the random component in the data and is defined as the square root of RSS divided by the degrees of freedom, sqrt(RSS/doF). Just as with RSS, an RMSE value closer to 0 indicates a fit that is more useful for prediction.
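All of the statistics above can be computed from the observed values, the fitted values and the number of fitted coefficients. A Python sketch (the function name is illustrative; unit weights are assumed; the adjusted R-square uses its standard degrees-of-freedom definition, which the text above does not spell out explicitly):

```python
import math

# Given observed values ys, fitted values fs and the number of fitted
# coefficients p, compute the goodness-of-fit statistics as defined in
# the text above (unit weights assumed).

def fit_statistics(ys, fs, p):
    n = len(ys)
    dof = n - p                                   # degrees of freedom
    mean = sum(ys) / n
    rss = sum((y - f) ** 2 for y, f in zip(ys, fs))
    sst = sum((y - mean) ** 2 for y in ys)        # total sum of squares
    return {
        "RSS": rss,
        "reduced chi^2": rss / dof,
        "R^2": 1.0 - rss / sst,
        # standard doF-adjusted definition of R-square
        "adjusted R^2": 1.0 - (rss / dof) / (sst / (n - 1)),
        "RMSE": math.sqrt(rss / dof),
    }

stats = fit_statistics([1.0, 3.0, 5.0, 7.2], [1.1, 2.9, 5.1, 7.0], p=2)
print(stats["RSS"])      # sum of squared residuals for this example
print(stats["RMSE"])     # sqrt(RSS/doF)
```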