As a freelance media measurement specialist I do not have the benefit of Google-style 20% time to work on my own projects. But if I did, and it is my aim to make more time available, I would like to spend it researching the developing understanding of big data. Big data involves the collection, curation, and analysis of data on a massive scale. From multiple sensors spread around an aeroplane to CCTV, cheap storage and analysis tools are making it possible for pioneers to glean new insights. It often involves combining servers into clusters using frameworks such as Apache Hadoop.
A couple of client requests got me thinking about its relevance. At first they all seemed pretty simple… ‘what sort of language are people using around my brand?’ It kinda seemed like a brand mapping request, but based only on their media coverage. They wanted to get a feel for the ‘love’ or ‘hate’ around the brand and which recurring signals had velocity… something a bit more than a tag cloud.
I used a couple of media collection and analysis tools but found their quantitative, commoditised approach fell short. I also used a few language analysis tools but found them keen to highlight the nouns when I was much more interested in the verbs… the ‘doing’ words.
I plugged a corpus of the most likely verbs in and tracked their usage, which was okay but left me feeling my reactive approach was missing something. Then I found this post on tools to extract ideas from unstructured text. In effect, this is a work in progress; I have tried a couple of the tools mentioned, but my programming skills are relatively limited and the results are a bit thin so far.
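For anyone curious what tracking a list of verbs might look like in practice, here is a minimal sketch in Python using only the standard library. It is not the tooling I actually used; the watch list, the function name `count_tracked_verbs` and the sample coverage are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical watch list: each base verb maps to the inflected forms to count.
TRACKED_VERBS = {
    "love": {"love", "loves", "loved", "loving"},
    "hate": {"hate", "hates", "hated", "hating"},
    "recommend": {"recommend", "recommends", "recommended", "recommending"},
}


def count_tracked_verbs(articles):
    """Count how often each tracked verb (in any listed form) appears."""
    # Reverse lookup from inflected form to its base verb.
    form_to_base = {
        form: base for base, forms in TRACKED_VERBS.items() for form in forms
    }
    counts = Counter()
    for text in articles:
        # Lowercase the text and split it into simple word tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            base = form_to_base.get(token)
            if base is not None:
                counts[base] += 1
    return counts


# Invented sample coverage, purely for demonstration.
sample_coverage = [
    "Customers love the new range and many recommended it to friends.",
    "Some reviewers hated the packaging, but most loved the product.",
]

for verb, total in count_tracked_verbs(sample_coverage).most_common():
    print(verb, total)
```

This is plain word matching, so it cannot tell ‘love’ the verb from ‘love’ the noun, and it only finds the words already on the list. That is the same reactive limitation I describe above; telling the two apart would need a proper part-of-speech tagger.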