Exploring survey data with IBM Many Eyes

IBM Many Eyes is a free tool that makes it easy to create quick visualisations of open datasets. In this recipe we explore how to use Many Eyes to analyse data from a survey, focussing on Pie Charts, Bubble Charts and Matrix Charts for looking at one, two and three different variables (questions and answers from the survey) at once.

You Will Need

  • Data to visualise - for this recipe, survey data with multiple-choice answers to a range of questions
  • Internet access

Step by Step

Step 1 : Prepare Your Data

IBM Many Eyes works best with CSV or Excel tables of data, pasted in from the clipboard (copy and paste). The first row of your dataset needs to contain the column headings. If you have a column that Many Eyes should treat as a date or a number, make sure every value in that column really is a date or a number (for example, if you have NA where no number is available, replace it with 0 or simply leave the cell blank).

When Many Eyes builds a chart it needs to add up the values from a numerical column in your dataset to work out what size to display bars, pie chart slices and so on. To make sure it has a column to work with, add a column right at the start of the sheet in Excel or your chosen spreadsheet. Label the header cell 'Count', and put a 1 in every other cell in the column.

Check your data carefully and remove any columns containing private or personal data, because when you use Many Eyes your data will be published on the web.
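
If you would rather script this tidy-up than do it by hand in a spreadsheet, the checks above can be sketched in a few lines of Python. This is only an illustration, not part of Many Eyes: the function names, the column names and the list of "missing" markers are all assumptions made for the example.

```python
import csv
import io


def prepare_rows(rows, private_columns=(), numeric_columns=(),
                 missing_markers=("NA", "N/A")):
    """Return a cleaned copy of `rows` (first row = column headings).

    - drops every column named in `private_columns`
    - blanks out missing-value markers in the columns named in `numeric_columns`
    - adds a leading 'Count' column holding 1 for every data row
    All parameter names here are hypothetical, chosen for this sketch.
    """
    header, *body = rows
    keep = [i for i, name in enumerate(header) if name not in private_columns]
    numeric = {i for i in keep if header[i] in numeric_columns}

    cleaned = [["Count"] + [header[i] for i in keep]]
    for row in body:
        out = ["1"]
        for i in keep:
            value = row[i].strip() if i < len(row) else ""
            if i in numeric and value in missing_markers:
                value = ""  # leave the cell blank rather than keep 'NA'
            out.append(value)
        cleaned.append(out)
    return cleaned


def to_csv_text(rows):
    """Serialise rows as CSV text, ready to save or open in a spreadsheet."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

For example, prepare_rows(rows, private_columns={"Email"}, numeric_columns={"Visits"}) would drop a column headed Email, blank any NA in a column headed Visits, and put the 'Count' column first.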

Example

We were working with data from a Survey Monkey online survey. Survey Monkey's Excel export of results spreads the headings across the first two rows of the data, so we had to tidy this up by putting a clear heading for each column into the first row and then removing the second. To make our life easier, we gave some of the columns short headings (instead of the full question that was asked in the survey). We removed any columns with personal information, such as the IP addresses and e-mail addresses of survey respondents. We added a 'Count' column to the start of the sheet.
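
The heading tidy-up could also be scripted. The sketch below assumes a particular layout for the two heading rows (question text in the first row, an optional sub-heading in the second, and a blank first-row cell meaning "same question as the column to the left"). That layout is an assumption for illustration, so check it against your own export; merge_header_rows and short_names are made-up names.

```python
from itertools import zip_longest


def merge_header_rows(first_row, second_row, short_names=None):
    """Collapse two heading rows into a single list of column headings.

    `short_names` optionally maps a full question to a shorter heading.
    The assumed two-row layout is described in the text above; it is not
    taken from any Survey Monkey documentation.
    """
    short_names = short_names or {}
    merged = []
    question = ""
    for top, sub in zip_longest(first_row, second_row, fillvalue=""):
        top, sub = top.strip(), sub.strip()
        if top:
            question = top  # a blank top cell continues the previous question
        name = short_names.get(question, question)
        merged.append(f"{name} - {sub}" if sub else name)
    return merged
```

The merged list then becomes the single heading row, and the original second row is dropped.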

Step 2 : Sign Up and Upload Data

You will need an account to add your own data to Many Eyes (though you can experiment with the visualisation steps without signing up - just look at the list of public datasets to find one you want to explore). Once you have signed up and logged in, go to 'Upload a Dataset', where you will find a large input box. Highlight all the data in your spreadsheet table, including the header row, and copy it to the clipboard. Go back to the Many Eyes site and paste the data into the input box. If you are working with a really big dataset this might take some time.

Many Eyes will try to work out the structure of your data and will show a preview further down the page. Check this looks OK (it should be showing the data as a table), and if so, enter some background information about your data in the relevant boxes before creating your dataset on Many Eyes.

We found in testing that copying and pasting from a Google Spreadsheet didn't work very well, but your experience may vary. If you find Many Eyes isn't detecting the structure of your data and you have an alternative spreadsheet programme available to copy from you might want to try that.
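
One thing you can check on your own machine before pasting is that every row has the same number of columns as the heading row, since a ragged table is an obvious reason for a preview not to look like a table. The sketch below is a local sanity check only; it does not reproduce whatever Many Eyes does to detect structure, and find_ragged_rows is a made-up name.

```python
def find_ragged_rows(rows):
    """Return (row_number, column_count) for rows whose width differs from the header.

    Row numbers are 1-based and count the heading row as row 1, so they
    match the row numbers shown in a spreadsheet.
    """
    if not rows:
        return []
    expected = len(rows[0])
    return [
        (number, len(row))
        for number, row in enumerate(rows[1:], start=2)
        if len(row) != expected
    ]
```

An empty result means the table is rectangular; anything else points you at the rows to inspect.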

Step 3 : One Dimensional Visualisation

You can now access your dataset from the 'My Contributions' menu. Load the page for your dataset. You might want to bookmark this page or share the link with others. Under the preview table of your data you will see an option to “Visualise” the data. Click that link.

Many Eyes has a range of different visualisations - all suited to different sorts of data. The sort of multiple-choice question survey data we are dealing with works best with the Pie Chart, Bubble Chart and Matrix Chart visualisations. (The text visualisations work best with a dataset of free text, and the TreeMap or Geographical visualisations need data to be specially prepared with numerical columns or geographical codes in.)

To start, pick the Pie Chart visualisation. This allows you to see, for any question in your dataset, how the answers were distributed. Many Eyes runs using Java Applets: you might need to give it permission to run or wait a few moments for it to load up the Pie Chart Visualisation interface.

Make sure your 'Count' column is selected in the 'Slice Size' drop-down box, and select the question you want to see results for from the 'Label' drop-down list.

Click on the pie-chart to get counts of how many responses are in each category.

Be careful to check the key whenever you choose a new question: Many Eyes often changes the colours it uses for each question - so just because blue was 'Yes' for one question doesn't mean it will be for the next.
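
If you want to double-check the numbers behind a pie chart, the same tally is easy to reproduce: add up the 'Count' column for each answer to one question. The sketch below assumes each response is a dictionary keyed by column heading (as csv.DictReader would give you); tally and shares are illustrative names, not anything from Many Eyes.

```python
from collections import Counter


def tally(records, question, count_field="Count"):
    """Sum the count column for each distinct answer to `question`."""
    totals = Counter()
    for record in records:
        totals[record[question]] += int(record[count_field])
    return totals


def shares(totals):
    """Convert answer totals into fractions of the whole (the slice sizes)."""
    total = sum(totals.values())
    if total == 0:
        return {}
    return {answer: n / total for answer, n in totals.items()}
```

Comparing these totals with the counts shown when you click on the chart is a quick way to confirm you have read the colour key correctly.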

Step 4 : Two Dimensional Visualisation

The Bubble Chart lets you look at two variables (questions) from your dataset at once. One of the variables is used to set the size of the bubbles, and then each bubble is divided by the second variable into coloured pie-chart slices.

Again, make sure the 'Count' column is selected for 'Bubble Size' and then, for 'Label', choose the variable/question you want to use to show the size of the bubbles. For 'Colour', choose the variable/question you want used to divide each bubble into a pie-chart.

Presuming your variables have a reasonable number of response categories (perhaps between 3 and 7 different possible responses for each question) you should be able to get a visual sense of how responses to the second question varied based on answers to the first.
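
The numbers behind a chart like this are a cross-tabulation: one total per combination of answers to the two questions. A minimal sketch, assuming each response is a dictionary keyed by column heading; cross_tab and its parameter names are hypothetical and simply mirror the roles of the two drop-downs described above.

```python
from collections import Counter, defaultdict


def cross_tab(records, label_question, colour_question, count_field="Count"):
    """Return {label answer: {colour answer: total}}.

    The outer keys correspond to the bubbles and the inner totals to the
    slices within each bubble.
    """
    table = defaultdict(Counter)
    for record in records:
        table[record[label_question]][record[colour_question]] += int(record[count_field])
    return {label: dict(slices) for label, slices in table.items()}


def bubble_sizes(table):
    """Total responses per bubble (the sum of its slices)."""
    return {label: sum(slices.values()) for label, slices in table.items()}
```

Reading the inner dictionaries side by side gives the same comparison the chart offers visually: how answers to the second question vary with answers to the first.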

Step 5 : Three Dimensions

The Matrix Chart lets you explore three variables at once. By default it will use bubbles again, although you can switch it to use bar-charts instead from the control panel on the left of the visualisation. You use one variable to put the values in columns, one to put them in rows, and one to turn each cell in the table this creates into a stacked bar chart or a pie-chart. Once you have chosen your variables you can read across the matrix to find the patterns of interest to you.

For example, in the Matrix Chart shown here, we are exploring how the age of young people, and the area they live in, affect the mode of transport they choose for getting to their local youth club. All three variables are shown on the one chart for us to explore.
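
The same idea extends to three questions: each cell of the matrix is identified by a (row answer, column answer) pair, and inside each cell the responses are split by the third question. The sketch below assumes responses held as dictionaries keyed by column heading; matrix_cells and the column names used in the usage note are made up for illustration.

```python
from collections import Counter, defaultdict


def matrix_cells(records, row_question, column_question, segment_question,
                 count_field="Count"):
    """Return {(row answer, column answer): {segment answer: total}}."""
    cells = defaultdict(Counter)
    for record in records:
        key = (record[row_question], record[column_question])
        cells[key][record[segment_question]] += int(record[count_field])
    return {key: dict(segments) for key, segments in cells.items()}
```

With columns headed Age, Area and Transport, matrix_cells(records, "Age", "Area", "Transport") would give the transport breakdown for every age and area combination that appears in the data.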

Step 6 : Embed It All

One of the big plusses of Many Eyes is that you can take your interactive visualisation and embed it in your own website. Below your chart (when you are logged in), you will see the option to give the visualisation a title, tags and a description, and then to publish it.

Once it is published (or on anyone else's published visualisation) you can use the 'Share' link below the chart to get an interactive chart to embed in your own website, or an image copy of the chart.

If you just need a quick image of your visualisation then you can take a screenshot of it. If you don't know how to make a screenshot on your computer (you don't need any special software) then search Google for 'create screenshot' and the name of your operating system (e.g. 'create screenshot mac osx' or 'create screenshot Windows XP').

Health Warning

Is it public data?

Any data you upload to IBM Many Eyes is immediately available on the public Internet - and is included in the Many Eyes list of recent datasets. Make sure you only share open data, and you are careful to never upload any personal data to Many Eyes.

Check your stats

Any survey or statistical data will have limitations. Particular things to watch out for:

  • Sample Size: have you got enough responses to be able to draw generalisations from them? For example, 75% of people might prefer your service to be open at weekends, but if that 75% is 3 people out of just 4 you asked, you're not likely to be able to draw any strong conclusions. Even if your dataset starts with lots of rows, when you carry out multidimensional analysis you might end up with some fairly small groups of people at different points. Statisticians have advanced methods for determining valid sample sizes which you can explore if you need to draw strong conclusions.
  • Bias: beware of bias in the dataset, both in how the questions in a survey were asked and in who the data was collected from.
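
A quick way to spot the small-group problem described above is to count how many responses fall into each combination of answers and list the combinations with only a handful of people in them. The sketch below assumes one response per row (so it counts rows rather than summing 'Count'); small_groups is a made-up name, and the default threshold of 5 is an arbitrary illustration, not a statistical rule.

```python
from collections import Counter


def small_groups(records, questions, minimum=5):
    """Return {answer combination: size} for groups smaller than `minimum`.

    `questions` is the list of column headings being combined, e.g. the
    two or three variables used in a Bubble Chart or Matrix Chart.
    """
    sizes = Counter(tuple(record[q] for q in questions) for record in records)
    return {group: n for group, n in sizes.items() if n < minimum}
```

Any combination this returns deserves extra caution before you read a pattern into the matching part of a chart.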

Examples and variations

Add links to any examples of this recipe, or add notes on possible variations.

recipe/exploring_survey_data_with_many_eyes.txt · Last modified: 2011/02/17 23:16 by Tim Davies
CC Attribution-Share Alike 3.0 Unported