Lately I had to prepare for a test, so I avoided meetups and conferences and stopped writing on this blog. But I am still here, and today I attended a meetup that was fairly nice, though maybe not very well presented. The speaker was Mark Wilcock (LinkedIn). Unfortunately I arrived 5 minutes late, so I missed the very beginning.
The context was this: let’s say that a client has a lot of data and wants you to create a tool to visualise it. The first tool he spoke about is a language, R, defined on its website as “a language and environment for statistical computing and graphics”.
But then you have to show graphs and aggregate data. He gave the example of a simple web application for doing so, which takes a couple of months to develop (from the project definition to the development… I would say that at my previous company something like that took us much more than 3 years).
At this point he introduced Tableau, a piece of software that, according to him, “can do a lot in just a couple of hours of usage” even if it is the first time you use it. Tableau is not free, but he said that the license, being based on usage (you pay a monthly fee), is quite cheap.
Then he wanted to show us how to extract data from text: he took the Corporate Responsibility Reports of a large bank from 2010 to 2017 and analysed how words are used in them. He did this directly in R by splitting the text into words and counting them, which let him show how the number of occurrences of some words changed a lot from one year to another.
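To give an idea of what this kind of analysis looks like, here is a minimal sketch of counting words per year. Mark did it in R and I did not note down his code, so this is my own version in Python, not his. The function names (`word_counts`, `counts_by_year`, `track_word`) and the two tiny sample texts are made up for illustration; they are not taken from the real reports.

```python
import re
from collections import Counter


def word_counts(text):
    """Split a text into lowercase words and count each one."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def counts_by_year(reports):
    """Map each year to the word counts of that year's report."""
    return {year: word_counts(text) for year, text in reports.items()}


def track_word(word, yearly_counts):
    """Return how often a word occurs in each year, in year order."""
    return {year: counts[word] for year, counts in sorted(yearly_counts.items())}


# Placeholder texts standing in for the real reports (invented for this example).
reports = {
    2010: "Risk and growth. Growth for our customers.",
    2017: "Sustainability and climate. Climate risk matters.",
}

yearly = counts_by_year(reports)
print(track_word("climate", yearly))
print(track_word("growth", yearly))
```

With real reports you would probably also want to drop very common words ("and", "for", "our") before comparing the years, otherwise they dominate the counts.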
He then moved on to analysing the recent results on the strength of EU banks published by the European Banking Authority. This is a large amount of data, which let him introduce Power BI. This tool is used only to visualise your data, but it has several nice features. The two highlighted by Mark were the large number of data sources from which it can retrieve data (including R functions) and the possibility to “publish” data and graphs on the internet, making them available “on the cloud” from laptops and tablets.
He then said a few words about Excel, just noting that it is a good tool for per-pixel table visualisation (meaning that you can easily create tables whose dimensions/widths you can change at will).
For machine learning, he presented Azure, comparing its visual editor with the complexity of doing machine learning in a programming language (again he used R). Then he commented on the advantages of cloud-based tools in general (availability, no need for servers and setups…).
Last but not least, he came back to text analysis, this time using the Cognitive Services Text Analytics API, and he got a better picture of the important words and related words than the one he had obtained using only R.
So, in the end, in one hour Mark Wilcock presented some very interesting tools for BI and machine learning in a way that was easy to follow but not fully satisfying.