Cleanup Site Map Service User Guide




Data Standard

Modified on 11/12/2009 07:28 PM by msowinski Categorized as Manage
All of the data collected is transformed into a common data standard. This section introduces the standard, defines the schema, and reviews the transformation process and services.

1 Introducing Common Environmental KML (CEK)

The Cleanup Site Map Service (CSMS) aggregates data from many sources and reports site information using common terms. Where they are available, EPA and other federal data standards are used, and these are translated into a common data standard for mapping called Common Environmental KML (CEK). The name comes from the service's initial application to Google Earth, which uses KML (Keyhole Markup Language).

The CSMS relies on federal and state agencies to serve as the data repositories. The CSMS itself serves as a map-enabled pointer that links to richer data held by the data owners (e.g., the states that provided the data). By design, therefore, the CEK is a simple schema that contains far fewer data elements than an agency's database.

2 The CEK Data Schema


The collected data is currently transformed into two tables: a facility table and a dataset table. A third table, for geometry, is planned for when the CSMS begins collecting perimeters (i.e., polygons) of institutional or engineering controls. The schema may be viewed as a web spreadsheet ("Web Spreadsheet") in which each tab represents a separate table. The tables are as follows:

2.1 Facility Information Table

This table has one record per facility, so a single dataset may contribute upwards of 45,000 records. In general, the facility table holds the data elements that give the facility name, its location (both as a physical address and as latitude and longitude), the agency's URL for a web page with facility information, institutional or engineering control information, and various facility identification numbers, including both a state and a federal number. A few system data elements (shown in gray) augment the state data. Although upwards of 25 data elements are contemplated for the table, typically about 10 fields are collected from an agency.
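As a rough illustration of a facility record, the following sketch models one row of the facility table. The field names are hypothetical and chosen only to mirror the data elements described above; they are not the actual CEK column names. Most fields are optional, reflecting that an agency typically supplies only a subset.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FacilityRecord:
    """One row of a facility table. Field names are illustrative assumptions."""
    facility_name: str
    street_address: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    agency_url: Optional[str] = None      # agency web page for the facility
    control_info: Optional[str] = None    # institutional or engineering control summary
    state_id: Optional[str] = None        # state facility identification number
    federal_id: Optional[str] = None      # federal facility identification number


def populated_fields(record: FacilityRecord) -> List[str]:
    """Return the names of the fields that carry a value, in declaration order."""
    return [f.name for f in fields(record) if getattr(record, f.name) is not None]


# Made-up example values, not real facility data.
example = FacilityRecord(
    facility_name="Example Facility",
    street_address="100 Example St",
    city="Exampleville",
    state="WA",
    state_id="EX-0001",
)
```

Calling `populated_fields(example)` lists which elements an agency actually supplied, which is one simple way to see how sparse a record is relative to the full schema.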

2.2 Dataset Information Table

This table has one record per dataset. Each record provides the description of the dataset, the agency contacts associated with it, and the applicable agency URLs, including one for feedback.
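The relationship between the two tables (one dataset record, many facility records) can be sketched as follows. The field names and the `dataset_id` key that ties a facility row to its dataset are assumptions made for this example; the source does not name the linking column.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class DatasetRecord:
    """One row of a dataset table. Field names are illustrative assumptions."""
    dataset_id: str       # hypothetical key shared with facility rows
    description: str
    contact_name: str
    contact_email: str
    agency_url: str
    feedback_url: str


def facility_counts(facility_rows: Iterable[Dict[str, str]]) -> Counter:
    """Count facility rows per dataset_id."""
    return Counter(row["dataset_id"] for row in facility_rows)


# Made-up example values, not real agency data.
dataset = DatasetRecord(
    dataset_id="DS-1",
    description="Example state cleanup sites",
    contact_name="Example Contact",
    contact_email="contact@example.org",
    agency_url="https://example.org/sites",
    feedback_url="https://example.org/feedback",
)

facility_rows = [
    {"facility_name": "Site A", "dataset_id": "DS-1"},
    {"facility_name": "Site B", "dataset_id": "DS-1"},
]

counts = facility_counts(facility_rows)
```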

2.3 Facility Multi-Geometry Information Table

This is a preliminary rendering of the schema that would be applied to introduce additional geometric features. The table anticipates that one facility may have multiple geographic features. For example, it is common for a facility to incorporate multiple institutional or engineering controls to protect an environmental remedy.

Geographic Classes:
  • Institutional Controls (yellow)
  • Engineering Control (blue)
  • Operable Unit Boundaries (optional)
  • Site Boundary (optional): the area shown to have site-related contamination. A site with a migrating plume could enlarge; however, the boundary is applied in varying ways.
  • Area of Contamination (may need a media class: soil, surface water, sediment, and groundwater) (brown)
  • Area in Reuse (an area that is ready for reuse)
  • Renewable Layer (hold)

Click on the polygon to display information:
  • Mimic the existing balloon for a site icon
  • Focus on properties of the feature
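The geographic classes and the one-facility-to-many-features relationship described in this section might be modeled along these lines. This is a minimal sketch: the type names, the `facility_id` key, and the coordinate layout are assumptions, and only the colors stated in the list above are included. The Renewable Layer class is left out because it is marked as on hold.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class GeoClass(Enum):
    INSTITUTIONAL_CONTROL = "Institutional Control"
    ENGINEERING_CONTROL = "Engineering Control"
    OPERABLE_UNIT_BOUNDARY = "Operable Unit Boundary"
    SITE_BOUNDARY = "Site Boundary"
    AREA_OF_CONTAMINATION = "Area of Contamination"
    AREA_IN_REUSE = "Area in Reuse"


# Only classes with a color named in the list above appear here.
CLASS_COLORS = {
    GeoClass.INSTITUTIONAL_CONTROL: "yellow",
    GeoClass.ENGINEERING_CONTROL: "blue",
    GeoClass.AREA_OF_CONTAMINATION: "brown",
}


@dataclass
class GeometryFeature:
    facility_id: str                       # hypothetical key back to the facility table
    geo_class: GeoClass
    perimeter: List[Tuple[float, float]]   # (longitude, latitude) vertices


def features_for(facility_id: str, features: List[GeometryFeature]) -> List[GeometryFeature]:
    """Return every geometric feature that belongs to one facility."""
    return [f for f in features if f.facility_id == facility_id]


def display_color(feature: GeometryFeature) -> Optional[str]:
    """Return the display color for a feature, or None if no color is defined."""
    return CLASS_COLORS.get(feature.geo_class)


# Made-up coordinates for illustration only.
square = [(-122.0, 47.0), (-122.0, 47.1), (-121.9, 47.1), (-121.9, 47.0)]
features = [
    GeometryFeature("F1", GeoClass.INSTITUTIONAL_CONTROL, square),
    GeometryFeature("F1", GeoClass.ENGINEERING_CONTROL, square),
    GeometryFeature("F2", GeoClass.SITE_BOUNDARY, square),
]
```

Keeping the class-to-color mapping separate from the feature records means a color can be assigned to the remaining classes later without touching stored geometry.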

2.4 Next Steps in Schema Development

An understanding of user scenarios will guide the development of this schema. Structuring Activity and Use Limitations within Institutional Controls will guide local government, while structuring Chemicals of Concern will aid use of the service by contractors developing health and safety plans.

  • Activity and Use Limitations (AUL). The CEK schema should accept multiple categories of activity and use limitations reflected within the institutional controls for a facility. Representative restrictions include prohibitions on uses, such as residential use, and prohibitions on activities, such as excavation. Several institutional control web repositories, such as those of Washington and California, have now developed look-up tables for activity and use limitations. The CEK should accept these and allow mapping to a common schema, as now occurs for status codes. The effort here is toward a national AUL status lookup table. No such table has been formed yet, but work is under way with the look-up tables that are in development in states and other agencies. The "AUL Lookup Tables" spreadsheet shows AUL lookup tables from Washington, California and USEPA Superfund.
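Mapping agency-specific AUL codes to a common schema could work roughly as sketched below. Everything in the lookup table is invented for illustration: the agency labels, the agency codes, and the common category names are assumptions, not entries from the Washington, California, or USEPA Superfund tables. The two common categories simply echo the residential-use and excavation examples mentioned above.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup: (agency, agency-specific code) -> common AUL category.
AUL_LOOKUP: Dict[Tuple[str, str], str] = {
    ("AGENCY_A", "R1"): "NO_RESIDENTIAL_USE",
    ("AGENCY_A", "X2"): "NO_EXCAVATION",
    ("AGENCY_B", "RES-PROHIB"): "NO_RESIDENTIAL_USE",
}


def to_common_aul(agency: str, codes: List[str]) -> Tuple[List[str], List[str]]:
    """Translate one facility's agency codes into common AUL categories.

    Returns (mapped, unmapped): the distinct common categories in first-seen
    order, and the agency codes that have no entry in the lookup table.
    """
    mapped: List[str] = []
    unmapped: List[str] = []
    for code in codes:
        common = AUL_LOOKUP.get((agency, code))
        if common is None:
            unmapped.append(code)
        elif common not in mapped:
            mapped.append(common)
    return mapped, unmapped
```

Returning the unmapped codes alongside the mapped ones keeps unknown agency values visible, so the lookup table can be extended instead of silently dropping a restriction.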

3 Data Transformation and Processing

The core facility data is not modified, but it is transformed as it is entered into the CEK. An incoming dataset may arrive in any of several formats, including ESRI shapefiles, web pages, and Microsoft Excel or Access files. These incoming datasets are imported into a common Microsoft SQL Server (MSSQL) format. Once the data is imported, it is mapped to the CEK schema, the simplified red, yellow and green status codes are generated, and geocoding is performed when necessary.
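The three steps described above (mapping to the schema, generating a status color, and geocoding when coordinates are missing) can be sketched as a single function. This is an assumption-laden illustration and not the CSMS implementation: the source column names, the CEK field names, and especially the rule that assigns red, yellow, or green to an agency status are all hypothetical. The geocoder is passed in as a function so that no particular geocoding service is implied.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical mapping from one agency's column names to CEK-style field names.
FIELD_MAP = {
    "SiteName": "facility_name",
    "Addr": "street_address",
    "Lat": "latitude",
    "Lon": "longitude",
    "Status": "agency_status",
}

# Hypothetical rule; the real assignment of colors to statuses is not given here.
STATUS_MAP = {"open": "red", "in progress": "yellow", "closed": "green"}

Geocoder = Callable[[Optional[str]], Tuple[float, float]]


def transform(row: Dict[str, Any], geocode: Geocoder) -> Dict[str, Any]:
    """Build a CEK-style record from an agency row without mutating the row."""
    # Step 1: copy mapped fields; the values themselves are left unchanged.
    cek = {FIELD_MAP[key]: value for key, value in row.items() if key in FIELD_MAP}
    # Step 2: derive the simplified status color (None if the status is unknown).
    status = str(cek.get("agency_status") or "").strip().lower()
    cek["status_color"] = STATUS_MAP.get(status)
    # Step 3: geocode only when a coordinate is missing.
    if cek.get("latitude") is None or cek.get("longitude") is None:
        cek["latitude"], cek["longitude"] = geocode(cek.get("street_address"))
        cek["geocoded"] = True
    else:
        cek["geocoded"] = False
    return cek


def fake_geocode(address: Optional[str]) -> Tuple[float, float]:
    """Stand-in geocoder returning fixed, made-up coordinates."""
    return (47.0, -122.0)


row = {"SiteName": "Example Site", "Addr": "100 Example St",
       "Lat": None, "Lon": None, "Status": "Closed"}
record = transform(row, fake_geocode)
```

The `geocoded` flag records whether the coordinates came from the agency or were derived, which keeps the derived values distinguishable from the unmodified core data.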

3.1 Mapping to the CEK Standard

3.2 Formation of Status Codes

3.3 Geocoding When Latitude and Longitude Are Missing

All content developed and maintained for USEPA by Terradex, Inc.