PTOOLS - Performance Tools for Reproducible Benchmarking and Quality Assurance

1. Overview
2. Installation and Prerequisites
3. Tests
4. Sample Usage
5. Options and Switches
6. Data Input File Format
7. Change History


1. OVERVIEW

PTOOLS facilitate reproducible performance testing, quality assurance, and
benchmarking of optimization software. The tools automate the task of
performance analysis and visualization, taking into account various
performance and quality measures. The server provides online tools for
measuring solver:

  * robustness
  * efficiency
  * quality of solution

The tools are also available via the web-based PAVER Server at
http://www.gamsworld.org/performance/paver/


2. INSTALLATION AND PREREQUISITES

To use the tools, you will need the following software:

a) GAMS Modeling System: Download at http://www.gams.com/download/
   You do not need a GAMS license in order to use ptools. Installation
   instructions are available at http://www.gams.com/docs/document.htm

b) Gnuplot: Download at http://www.gnuplot.info/ and install according to
   the instructions at the Gnuplot website.

You will need to add GAMS and Gnuplot to your environment PATH variable. On
Windows, you can type 'gams' and 'pgnuplot.exe' at the command prompt to see
if your environment is set up correctly. In most UNIX environments, you can
type 'gams' and 'gnuplot'.

The tools have been tested using GAMS 22.8 on Windows Vista, Mac OS 10.4,
and SuSE Linux but should work on most UNIX flavors.

Unzip the ptools.zip file. You should have a directory structure as follows:

    ptools-x.y/
    ptools-x.y/sample/
    ptools-x.y/test/

The actual ptools files are located in the root ptools-x.y/ directory. The
subfolder ptools-x.y/sample/ contains sample benchmark data (trace) files.
The subfolder ptools-x.y/test/ contains test scripts that run the sample
benchmark data input (trace) files.


3. TESTS

The ptools contain sample tests to illustrate sample benchmark and
performance analyses.
Sample tests are available for a variety of model types. We will illustrate
using the lp.gms test case for LP models.

Running a single test:

GAMS IDE on Windows:

a) Create a GAMS project under ptools-x.y/test/.
b) Open the file ptools-x.y/test/lp.gms
c) Run the file, which should generate a result folder under lpresults/
d) Open the file ptools-x.y/test/lpresults/results.htm in a web browser

All Other Environments:

a) Change directories to ptools-x.y/test/
b) At the command prompt, type

       gams lp

   which should generate a result folder under lpresults/
c) Open the file ptools-x.y/test/lpresults/results.htm in a web browser

A variety of test cases exist, including

    lp.gms:     Analysis of LP benchmarking data
    mip.gms:    Analysis of MIP benchmarking data
    nlp.gms:    Analysis of NLP benchmarking data
    minlp.gms:  Analysis of MINLP benchmarking data

Using the ptools for these cases is similar. For example, running
'gams nlp.gms' runs the analysis for NLP benchmarking data, with results
available at nlpresults/results.htm. See the GAMS files (lp.gms, mip.gms,
nlp.gms, or minlp.gms) for details.

Running all tests at once:

A script to run all tests at once is available (test.gms). Run

    gams test.gms


4. SAMPLE USAGE

In order to use the ptools, copy your trace benchmarking data files to the
ptools-x.y/ directory. The main tool is pprocess.gms. The pprocess.gms and
other tools offer a variety of options, which are described in the files
themselves.

a) Basic usage:

For default options and the tracefiles trace1.trc, trace2.trc, and
trace3.trc, just run

    gams pprocess.gms --trace1=trace1.trc --trace2=trace2.trc --trace3=trace3.trc

For example, copy the files lp1.trc, lp2.trc, and lp3.trc from
ptools-x.y/sample/ to ptools-x.y/. Then change directories to ptools-x.y/
and run

    gams pprocess.gms --trace1=lp1.trc --trace2=lp2.trc --trace3=lp3.trc

By default, results are generated under results/. Open results/results.htm
to view the results.
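
When the list of trace files changes often, the command line from the basic
usage example can be assembled by a small script. The following Python
sketch is only an illustration and is not part of ptools: the helper name
build_pprocess_command is hypothetical, and it assumes that the --traceN
options are numbered consecutively from 1, as in the example above.

```python
import shlex


def build_pprocess_command(trace_files):
    """Return the argument list for running pprocess.gms on the given trace files.

    Hypothetical helper: assumes the --trace1, --trace2, ... options are
    numbered consecutively in the order the files are given.
    """
    command = ["gams", "pprocess.gms"]
    for index, name in enumerate(trace_files, start=1):
        command.append(f"--trace{index}={name}")
    return command


command = build_pprocess_command(["lp1.trc", "lp2.trc", "lp3.trc"])
# Show the command as it would be typed at the prompt.
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from
within the ptools-x.y/ directory, provided that gams is on the PATH.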

b) Basic usage with parameter file:

If you have several options, the command can become very long, in which case
it is useful to use a parameter file. The same command as above can be run
as follows. Create a parameter file, which is just a text file, called
parm.opt with one -- option per line:

    --trace1=lp1.trc
    --trace2=lp2.trc
    --trace3=lp3.trc

The parameter file can have any name; we use parm.opt here. Then just run

    gams pprocess.gms parmfile=parm.opt

c) Change the default result directory:

This is accomplished via the --outdir option. Suppose we are comparing LP
simplex solvers and want to store results under the directory simplex/.
Create a file parm.opt with the entries

    --trace1=lp1.trc
    --trace2=lp2.trc
    --trace3=lp3.trc
    --outdir=simplex

and run

    gams pprocess.gms parmfile=parm.opt

d) Change tolerances and other options:

Suppose we want to change the resource time comparison criteria that decide
when solvers are considered the same (--tsame) and when one is considered
much faster (--tfaster). The options

    --trace1=lp1.trc
    --trace2=lp2.trc
    --trace3=lp3.trc
    --outdir=custom
    --tsame=5
    --tfaster=25

mean that resource times are considered the same if they are within 5% of
each other. A solver is considered faster than another if its time is not
within that 5% band but it is less than 25% faster than the other. A solver
is considered much faster than another if it is more than 25% faster than
the other. With these options stored in parm.opt, run

    gams pprocess.gms parmfile=parm.opt


5. OPTIONS AND SWITCHES

The ptools have a growing number of options and flags that influence their
behavior and the kind of output they produce.

a) --useobjest and --gaptol: evaluating solver performance by optimality gap

Global solvers for MIP, NLP, and MINLP models often provide (next to an
incumbent solution with an objective function value) a bound on how far the
incumbent's objective function value is from the global optimal value. In
GAMS, this value is called the objective estimate.
Since global solvers aim to minimize the gap between the incumbent objective
value and the objective estimate, it is useful to incorporate this gap into
the performance measurement. Thus, specifying the flag --useobjest=1 will
let the ptools

  - plot additional performance profiles where models are considered solved
    only if the gap is closed or below 10%, respectively,
  - add extra columns with information about objective estimates and gap
    (added to the last table in the solver square summary),
  - perform additional checks for inconsistencies related to the reported
    objective estimate.

Since one might not let the solver run until the gap is exactly zero but
instead allow a relative tolerance, one can specify such a tolerance in
ptools with the --gaptol option.

The tests mipobjest, mipnumnodes, and minlpincon demonstrate the --useobjest
option. Since the tracefiles for the minlpobjest test have been generated by
running global MINLP solvers with a gap tolerance of 1%, we also use the
--gaptol option there.

The test minlpincon includes tracefiles where the trace data of one solver
contains a lot of inconsistencies, e.g., the solver reported optimal without
having closed the gap, or it reported an objective estimate that is better
than the objective value reported by another solver. As a result, the
performance values and some mean values become distorted. We have included
this example nonetheless to show the value of the inconsistency checks.

b) --reslim and --meanonopt: mean value computations in timings and solver
   square summary

Sometimes a user may want a single number (metric) that identifies a
solver's performance. For that purpose, ptools computes mean values of
solver times in the timings and solver square tables. The mean is a shifted
geometric mean in which unsolved models and models above the time limit are
counted with the time limit. To specify the time limit, you can use the
--reslim option.
If not specified, ptools will take the maximum time among all solvers' input
data.

Furthermore, it is not always clear when a model should be considered
solved. In the timings table, a model is considered solved by a solver if a
feasible point has been found (according to the reported model and solver
status). However, using the --meanonopt switch, you can enable an additional
row where mean values are computed based on models marked with an optimal
model status (those not optimal are penalized by a max reslim factor).

In the last table of the solver square summary, the mean value computation
is also based on model status return codes indicating that a feasible point
was found. However, if the --useobjest option has been specified, then only
models where the relative gap between the best objective value and the
objective estimate is below the gap tolerance are considered solved.

The tests mipobjest, mipnumnodes, and minlpincon demonstrate this option.
Since the minlpincon example is based on tracefiles where one solver
declares models as solved to optimality even though the gap is not closed,
the additional row in the timings table gives a wrong impression of that
solver's performance.

c) --colselect: compare solver performances by a user-defined measure

Most often, solvers are compared by the time that they spend solving a
model. However, in some cases, one would like to compare solvers using a
different measure. Using the --colselect option, you can specify a column in
the tracefiles that should be used. The data from this column will then be
printed as an additional column in the timings table, including mean value
computations, and the performance profiles are plotted w.r.t. the data in
this column.

By default, --colselect is the resource time for the performance profiles.
For the timings table, --colselect is ETSolver by default. This column is
not included in standard tracefiles (traceopt=3), but can be added by the
user (see also the next section).
ETSolver stands for the "wall clock time" that a solver spends on solving a
model.

The tests nlpcolselect, mipnumnodes, and minlpwalltime2 demonstrate this
option. In the nlpcolselect example, an NLP has been run with different
linear algebra subsystems. The ratio between the complete solver time and
the time spent in the linear algebra routines is then used as the measure.
In the mipnumnodes example, outcomes of MIP solvers based on
branch-and-bound are compared by the number of nodes that they used.
Finally, in the minlpwalltime2 example, performance profiles are plotted
w.r.t. the solver's wall clock time (ETSolver) instead of the timings
reported by the solver. The tracefile data has been chosen such that on some
instances one of the solvers has wall clock times considerably different
from the solver time; see also the inconsistency checks.

d) --modellib: linking to models in a library

When benchmark data has been computed with models from one of the GAMSWorld
model libraries, the model names in the last solver square summary table and
in the timings table can be linked to the GAMS model in the model library.
Currently, the values minlplib, globallib, and linlib are supported for the
--modellib option.

The tests mipobjest, nlpcolsel, mipnumnodes, minlpincon, minlpwalltime1, and
minlpwalltime2 demonstrate this option.

e) --modelfile: restricting the set of models that are considered in the
   comparison

The test mipsubset demonstrates this option.

f) --uselog: linking to solver log files

g) --allsolver: add cumulative solver curves to absolute performance
   profiles


6. DATA INPUT FILE FORMAT

The ptools accept a format we call "trace" format, which is a
comma-delimited file with one record (line) per model solved. Each solver
has a separate trace file for the models it solved. The first lines of each
trace file consist of comment lines, which are denoted by a leading *, as
well as descriptions of the available columns in the file.
The minimal column definition required is:

    InputFileName
    ModelType
    SolverName
    Direction
    ModelStatus
    SolverStatus
    ObjectiveValue
    SolverTime

A sample trace file with 3 data records (models solved) may look like this:

    * Trace Record Definition
    * InputFileName,ModelType,SolverName,Direction,ModelStatus,SolverStatus,ObjectiveValue,SolverTime
    aa01,LP,LPsolver1,0,1,1,5.55354363882243E+04,1.55800000000000E+01
    aa03,LP,LPsolver1,0,1,1,4.96163636363636E+04,7.70000000000000E+00
    air02,LP,LPsolver1,0,1,1,7.64000000000000E+03,4.80000000000000E-01

Within GAMS, trace files can be generated automatically using the
"traceopt=3 trace=[filename]" command line options. This format contains
additional information that users may find useful, but is not required for
using ptools. The GAMS traceopt=3 format may include an additional comment
line, for example

    * Trace Record Definition
    * GamsSolve
    * InputFileName,ModelType,SolverName,OptionFile,Direction, ...

If no comments exist at the beginning of the file as described above, then
it is assumed the trace file is in traceopt=3 format.


7. CHANGE HISTORY

2008/11/07
  - New time profiles listing percentage of models solved by time.
  - Addition of super solvers to time profiles:
      ALL_PAR:   solver consisting of all solvers run in parallel. Solver
                 time used is the max over all solver times.
      ALL_SEQ:   solver consisting of all solvers run in sequence. Solver
                 time used is the sum of all solver times.
      CAN_SOLVE: solver consisting of all solvers. Solver time used is the
                 minimum of all solver times for a given model.
  - Computation of (shifted) geometric mean of resource times for models
    where a feasible solution was found.
  - Computation of percentage of models unsolved.
  - Enhanced timing information.
  - Available as downloadable zip with sample data files and sample tests.
    The tools were previously available only via the PAVER server.
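
As an illustration of the trace format described in section 6, the
following Python sketch reads the minimal sample trace file shown there. It
is not part of ptools and does not reflect how pprocess.gms reads its input.
The function name parse_trace is hypothetical, and the sketch assumes that
the column definition is a single comment line containing commas. If several
comment lines contain commas, the last one wins, so a column definition that
spans more than one line is not handled.

```python
import csv

SAMPLE = """\
* Trace Record Definition
* InputFileName,ModelType,SolverName,Direction,ModelStatus,SolverStatus,ObjectiveValue,SolverTime
aa01,LP,LPsolver1,0,1,1,5.55354363882243E+04,1.55800000000000E+01
aa03,LP,LPsolver1,0,1,1,4.96163636363636E+04,7.70000000000000E+00
air02,LP,LPsolver1,0,1,1,7.64000000000000E+03,4.80000000000000E-01
"""


def parse_trace(text):
    """Parse trace-format text into a list of dicts keyed by column name.

    Assumption: the column definition is one comment line (leading *)
    that contains commas; other comment lines are ignored.
    """
    columns = None
    records = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("*"):
            body = line.lstrip("*").strip()
            if "," in body:
                columns = [name.strip() for name in body.split(",")]
            continue
        if columns is None:
            raise ValueError("no column definition found before the first record")
        values = next(csv.reader([line]))
        records.append(dict(zip(columns, values)))
    return records


records = parse_trace(SAMPLE)
# All values are kept as strings; convert numeric columns as needed.
total_time = sum(float(record["SolverTime"]) for record in records)
print(len(records), "records, total solver time", round(total_time, 2))
```

Keeping every value as a string leaves the interpretation of columns such as
ModelStatus and SolverStatus to the caller.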