Job cuts at Microsoft
A number of people have forwarded various links to me today on the subject of…
If you have Office 2013 Professional (or Pro Plus, which is part of certain Office 365 offerings), you’ll be entitled to use a set of business intelligence add-ins for Excel 2013 which I refer to as the Power Tools. The full set consists of:
I describe Power Query as the part of the process that comes before the modelling, analysis and presentation provided by Power Pivot and Power View. But first, a question… why use Excel for business intelligence?
Okay, back to Power Query… as I said, it often comes first in the BI process, and that’s because it can be used to obtain data, filter it and generally tidy it up before moving on to the modelling, analysis and presentation stages. It’s very easy to get side-tracked by the fun side of Power Query (really, it can be fun) and we’ll get onto that, but let’s look at the basics. Firstly, Power Query allows you to grab data from a source:
Once you have secured your data, you can clean it up – this may involve filtering to get down to a subset of records, removing unwanted columns, adding custom columns, and transforming text. You can also append the data to an existing table or merge with another data set (example coming up). Click on the image on the right to see the full query editor.
Obviously there are levels of complexity to getting to the data – grabbing data from another Excel file is pretty simple. If you’re grabbing data from a relational database you probably need to know something about the databases and tables. Having said that, I was able to query UK cash point (ATM) data from the Azure Marketplace and filter it to the Guildford postcode area very quickly. Once the querying and clean-up is complete you’re left with an Excel table – this can be analysed further (e.g. using a pivot table) or added to a data model for use in Power Pivot.
I mentioned the fun part, and here it is. An online search will look for tables of data related to your search term. Most of the results seem to be Wikipedia pages, and indeed there is some clever API stuff going on to directly extract a table from a Wikipedia page. Depending on the search term you’ll also see results from data.gov. Using the online search it’s incredibly quick and easy to pull in data on just about any subject – the first timed I saw this demoed the presenter asked someone in the audience to shout out the name of a movie star (Brad Pitt, a rather obvious choice) and within seconds we were looking at a table of Mr Pitt’s filmography.
My favourite demo goes like this… I search on ‘premier league’ and select the Wikipedia page for ‘Managers – Premier League’. In comes a table listing the managers, in order of their appointment (it’s out of date at the moment, because no-one has updated the page since Villas-Boas, Jol, Mackay and Clarke were shown the door, and it doesn’t list Tony Pulis at Crystal Palace). Remove the first column which is blank, we don’t need that. Then I perform another search, this time for ‘premier league stadia’, select the 2013-14 page, and we have a second table listing the clubs’ stadium names and capacities.
Now I can select the Merge option – we have the two tables available (created by the earlier queries) and it’s just a matter of selecting the columns they have in common (one is Team, the other is Club, but that doesn’t matter). The penultimate step is to select which columns from one table we want added to the other (we don’t need the club name twice so the second instance doesn’t need to be included). A final bit of tidying up (renaming columns, swapping their positions, maybe changing the sort order) and we’re done. After clicking on ‘Apply and Close’ we have a table which contains the Premier League clubs, details on their managers and their stadia. Very quick and simple.
You can download the Power Query preview from here.