
The Understanding America Study Data Pages give all registered users access to collected data for every data set that is not currently under embargo. (Register here.) Non-registered users may explore the fielded UAS surveys and their individual variables (at https://uasdata.usc.edu/surveys) but cannot view or download data until they register.

Surveys are designed by research teams around the world, programmed and tested by our team at the Center for Economic and Social Research, translated into Spanish, and then fielded. Data quality is the primary responsibility of the client. Clients have the opportunity to test programmed surveys before they go into the field. Clients may also review sample test data and run a small pilot of their study if they wish. Best practices are encouraged in survey structure, including avoiding "don’t know" as an answer option whenever possible and adding data validation checks, such as numerical ranges and targeted soft and hard checks with appropriate error warning and information messages.

Several types of data are available for each UAS survey. A full description of UAS data is available in the UAS Data Pages Data Guide.

For download (generated nightly):

  • Survey data file with collected data (in STATA or CSV format). Data files provide "clean" data: answers to questions that are no longer applicable at survey completion (for example, because a respondent went back in the survey and skipped over a previously answered question) are treated as if the questions were never asked. In the data files, all questions that were asked but not answered by the respondent are marked with ".e". All questions never seen by the respondent (or any dirty data) are marked with ".a". The latter may mean that the respondent skipped over the question, or that the respondent never reached it because of a survey break-off. If a respondent did not complete a survey, the variables representing survey end date and time are marked with ".c". Household member variables are marked with ".m" if the respondent has fewer household members (e.g., if the number of household members is 2, any variables for household member 3 and up are marked with ".m"). Data files are also cleaned to match the standard variables listed here. The mapping from the cleaned-up data variables to the raw data can be found here.
  • Weights are available for completed surveys and are part of the data file. Weights for a survey that is ongoing are available by request (uas-weights-l@mymaillists.usc.edu). The following weight variables are included (detailed information on the UAS weighting procedures can be found here):
    • Base_weight: Relative base weights correcting for the over-representation of Native Americans in the survey sample. They average to one and sum to the UAS survey sample size.
    • Rel_weight: Relative post-stratification weights which ensure representativeness of the survey sample with respect to key selected variables (raking factors). They include the correction for the over-representation of Native Americans. They average to one and sum to the UAS survey sample size.
    • Imputation_flag: A binary variable indicating whether any of the variables used within the weighting procedure has been imputed.
  • Basic demographics of respondents (as collected in the My Household survey) who did not start the survey (in STATA or CSV format). Non-response data files are cleaned to match the standard variables listed here. The mapping from raw to cleaned-up data variables is provided here. Note: on occasion only a respondent identifier will be available. This is the case when the respondent was invited to the survey but had not completed the My Household survey before being invited.
  • Timings file containing three types of information for the time respondents spent:
    • Listing of the time spent per respondent for the interview as a whole (NOTE: any time spent on a screen longer than 5 minutes is excluded from this listing):
      • Uasid of the respondent.
      • Total time spent in seconds.
      • Total time spent in minutes.
      • Number of screens viewed by the respondent.
      • Average time spent per screen in seconds.
    • Listing of the time spent per question by all respondents combined (NOTE: any time spent on a screen longer than 5 minutes is excluded from this listing):
      • Total number of times.
      • Average time spent in seconds.
      • Average time spent in minutes.
    • Listing of each period of time spent per question by each individual respondent (NOTE: any time spent on a screen longer than 5 minutes is included in this listing. As such, the number of times a question is listed here may slightly differ from the total number of times a question is reported to have been shown in the listing of the time spent per question by all respondents combined):
      • Uasid of the respondent.
      • Time spent in seconds.
  • Platform information file containing information about the type of device, operating system and browser used by the respondent. Please note that these are based on the user agent strings reported by respondents' browsers. Such browser-reported user agent strings are known to be limited in their accuracy, so the provided platform information should be treated as indicative rather than absolute. More information on browser user agent strings and their limitations can be found here.
  • Codebook (PDF) describing the questions and skip logic of the survey. If a codebook is not available yet, a link to view the routing in the browser will be available instead.
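
As an illustration of how the missing-value codes and weights described in the list above might be handled after download, here is a minimal Python sketch. It assumes that the CSV export writes the codes as the literal strings ".e", ".a", ".c" and ".m", and that the identifier and weight columns are named Uasid and Rel_weight as in the descriptions above; the question variable q001 and the sample rows are made up for the example. Check the codebook and Data Guide for the actual names and encodings of a given survey.

```python
import csv
import io
from collections import Counter

# Missing-data codes as described in the list above.
MISSING_CODES = {
    ".e": "asked but not answered",
    ".a": "never seen by the respondent (or dirty data)",
    ".c": "survey not completed (end date and time variables)",
    ".m": "household member does not exist",
}


def summarize(rows, variable, weight="Rel_weight"):
    """Tally missing codes for `variable` and compute its weighted mean.

    Rows holding a missing code are counted and left out of the mean.
    """
    codes = Counter()
    weighted_sum = 0.0
    weight_total = 0.0
    for row in rows:
        value = row[variable].strip()
        if value in MISSING_CODES:
            codes[value] += 1
            continue
        w = float(row[weight])
        weighted_sum += w * float(value)
        weight_total += w
    mean = weighted_sum / weight_total if weight_total else None
    return codes, mean


# Made-up rows standing in for a downloaded CSV data file.
# "q001" is a hypothetical question variable.
SAMPLE = """Uasid,q001,Rel_weight
1,4,0.5
2,.e,1.0
3,2,1.5
4,.a,1.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
codes, mean = summarize(rows, "q001")
print(dict(codes), mean)
```

Keeping the codes separate rather than collapsing them into a single "missing" value preserves the distinction between a question that was shown but left unanswered (".e") and one that was never seen (".a").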

Online only (provided real-time unless otherwise mentioned):

  • Response rate (gross sample size, number of people started, number of people completed, number of people not started).
  • Graphical timing distribution of the time spent by respondents on the survey.
  • Tabular and graphical representation of the data for individual variables (NOTE: generated nightly).
  • Survey information:
    • Author(s)
    • Sample selection
    • Respondent compensation
    • Field dates
    • Average time spent in minutes (NOTE: generated nightly)
    • Question details. NOTE: the standard demographics described here differ from the raw variables in the question listings (in name and/or answer options). The mapping from cleaned data variables to the raw data can be found here.

In addition to the above, the following can be provided on request:

  • Item nonresponse summaries.
  • Custom extracts from the survey logs; for example, how often respondents went back and forth for a particular question.

Lastly, users can combine data from different studies, so that a wealth of additional information can be accessed at no additional charge (see for example the survey topics listed here).
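
Combining data from two studies could be sketched as a join on the respondent identifier. The Python example below assumes that both downloaded CSV files carry the same identifier column (called Uasid here, following the timings file description above); the variable names a_q1 and b_q1 and the sample rows are hypothetical.

```python
import csv
import io


def merge_on_id(left_rows, right_rows, key="Uasid"):
    """Inner-join two lists of row dicts on a shared respondent identifier."""
    right_by_id = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_by_id.get(row[key])
        if match is not None:
            # Columns from both surveys end up in one combined row.
            merged.append({**row, **match})
    return merged


# Made-up extracts standing in for two downloaded survey files.
SURVEY_A = """Uasid,a_q1
1,3
2,5
3,1
"""
SURVEY_B = """Uasid,b_q1
2,yes
3,no
4,yes
"""

a_rows = list(csv.DictReader(io.StringIO(SURVEY_A)))
b_rows = list(csv.DictReader(io.StringIO(SURVEY_B)))
combined = merge_on_id(a_rows, b_rows)
print(combined)
```

Only respondents present in both files are kept. Because the set of respondents changes after such a join, weights taken from either single survey may no longer be appropriate for the combined data.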