U.S. Geological Survey, Techniques and Methods 6–B5
By Donna S. Francy and Robert A. Darner
In Cooperation with the Cuyahoga County Board of Health, Northeast Ohio Regional Sewer District, Ohio Water Development Authority, and Ohio Lake Erie Office
http://pubs.water.usgs.gov/tm6b5/
Abstract
State recreational water-quality standards are based on concentrations of indicator organisms, such as Escherichia coli (E. coli). Because the analytical methods for enumerating E. coli take at least 18–24 hours to complete, some agencies have turned to predictive modeling to obtain near-real-time estimates of recreational water quality. The USGS has been working with local agencies to develop empirical predictive models for five Lake Erie beaches in Ohio. One beach, Huntington, is used as an example in this report to describe, step by step, how data for the models were collected and how the models were developed and evaluated. These steps are not the only procedures that can be used to develop predictive models for beaches; rather, they are the methods used by the authors for the reported datasets.
The steps to develop predictive models are data collection; exploratory data analysis; model development, selection, and diagnosis; determination of model output values; and model validation and refinement. For Huntington, the predictive model was based on data collected during the recreational seasons of 2000–2004. The explanatory variables were wave height, weighted rainfall in the past 48 hours, and log10 turbidity; the model explained 38 percent of the variability in E. coli concentrations. Two outputs from the model were calculated: (1) the predicted E. coli concentration and (2) the probability that the E. coli single-sample maximum bathing-water standard of 235 colony-forming units per 100 milliliters (CFU/100 mL) will be exceeded. A threshold probability of 29 percent was established for the Huntington 2000–2004 model. The threshold probability is the probability associated with too great a risk to allow swimming and is established by examining historical data. The model was validated in 2005 and yielded more correct responses and better predicted exceedance of the bathing-water standard than did the current method for assessing recreational water quality (using the previous day’s E. coli concentration).
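The two model outputs described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the response variable is log10 E. coli concentration, that residuals are normally distributed, and that an advisory is issued when the exceedance probability is at or above the threshold. The coefficients and residual standard deviation are hypothetical placeholders, not the fitted Huntington values; only the 235 CFU/100 mL standard and the 29 percent threshold come from the text.

```python
import math

STANDARD_CFU = 235.0          # single-sample maximum, CFU/100 mL (from the text)
THRESHOLD_PROBABILITY = 0.29  # Huntington 2000-2004 threshold (from the text)

# Hypothetical regression coefficients and residual standard deviation,
# chosen only for illustration; they are NOT the fitted Huntington values.
INTERCEPT = 1.0
B_WAVE = 0.2       # coefficient for wave height
B_RAIN = 0.5       # coefficient for weighted 48-hour rainfall
B_LOG_TURB = 0.6   # coefficient for log10 turbidity
RESIDUAL_SD = 0.5  # residual standard deviation, in log10 units


def predict_log10_ecoli(wave_height, weighted_rain, turbidity):
    """Linear prediction of log10 E. coli (assumed response variable)."""
    return (INTERCEPT
            + B_WAVE * wave_height
            + B_RAIN * weighted_rain
            + B_LOG_TURB * math.log10(turbidity))


def exceedance_probability(log10_pred):
    """Probability that the true concentration exceeds the standard,
    assuming normal residuals and ignoring coefficient uncertainty."""
    z = (math.log10(STANDARD_CFU) - log10_pred) / RESIDUAL_SD
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # equals 1 - Phi(z)


def nowcast(wave_height, weighted_rain, turbidity):
    """Return (predicted CFU/100 mL, exceedance probability, advisory flag)."""
    log10_pred = predict_log10_ecoli(wave_height, weighted_rain, turbidity)
    probability = exceedance_probability(log10_pred)
    advisory = probability >= THRESHOLD_PROBABILITY  # assumed decision rule
    return 10.0 ** log10_pred, probability, advisory
```

For example, `nowcast(3.0, 1.0, 100.0)` returns the predicted concentration, the probability of exceeding the standard, and whether that probability reaches the threshold. Comparing the probability with a threshold, rather than comparing the predicted concentration with 235 directly, lets the threshold be tuned against historical data.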
The procedures described in this report can be used to develop and test predictive models at other beaches. Predictive modeling is a dynamic process meant to augment existing beach-monitoring programs, not to replace them. Models should be continuously validated and refined to improve predictions and better protect public health. If validation tests are successful, a beach manager may decide to develop an Internet-based system that provides model predictions to the beach-going public. This type of system, called “nowcasting,” was implemented at Huntington on May 30, 2006.
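One way to carry out the kind of validation comparison mentioned above is to tally, for each day, whether an advisory decision matched the measured result. The Python sketch below is illustrative only: the function names, the response categories, and the sample concentrations are hypothetical, and the rule that the previous-day method posts an advisory when the prior day's concentration exceeded 235 CFU/100 mL is an assumption.

```python
STANDARD_CFU = 235.0  # single-sample maximum, CFU/100 mL (from the text)


def tally_responses(advisories, measured_cfu):
    """Count correct responses, false positives, and false negatives.

    advisories   -- list of booleans, True if an advisory was indicated
    measured_cfu -- measured E. coli concentrations for the same days
    """
    counts = {"correct": 0, "false_positive": 0, "false_negative": 0}
    for advisory, measured in zip(advisories, measured_cfu):
        exceeded = measured > STANDARD_CFU
        if advisory == exceeded:
            counts["correct"] += 1
        elif advisory:
            counts["false_positive"] += 1  # advisory, but standard was met
        else:
            counts["false_negative"] += 1  # no advisory, standard exceeded
    return counts


def previous_day_advisories(measured_cfu):
    """Advisories under the assumed previous-day rule: day i gets an
    advisory if day i-1 exceeded the standard. The first day has no
    prior sample, so the result covers days 2 onward."""
    return [previous > STANDARD_CFU for previous in measured_cfu[:-1]]


# Hypothetical measurements for five consecutive days (not report data).
measured = [100.0, 300.0, 150.0, 400.0, 90.0]
baseline = tally_responses(previous_day_advisories(measured), measured[1:])
```

The same `tally_responses` function can be applied to the model's advisory flags for the same days, so that the model and the previous-day method are compared on identical measurements.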