Sunday, May 16, 2010

The Intelligence of Social Media (Part 1)

According to Wikipedia, “social media is online content created by people using highly accessible and scalable publishing technologies.” Networking today is very different from what it was in the past. Social media services such as Twitter, Facebook, LinkedIn, personal blogs, wikis, podcasts, and other types of media content generate large volumes of data. More importantly, people contribute to the creation of this data by chatting, expressing ideas, and forming personal and business relationships online. They also shape the way social media information is organized and published on the Web. Today, these massive volumes of data are objects of study and analysis, and there is already an effort to measure both their quantitative and qualitative aspects.

The main issue here is that the information generated by social media is created by “common people” who affect trends, social tendencies, and business by influencing the popularity of and demand for products. According to the Internet Usage Statistics chart, there are more than 1,668,870,408 Internet users worldwide. We can’t deny that the social media phenomenon is huge for business and global markets, and it has to be measured, analyzed, and considered from a business intelligence (BI) perspective. Many people discuss topics (e.g., products, services, and companies) and express their opinions about them, good or bad. But how is this data going to be measured?

Companies can measure and analyze data generated on their own sites using Web analytics tools like Google Analytics. But there is a bigger universe to explore, and a lot more information to deal with, outside a company’s own corporate Web site. How does a company measure and analyze data generated outside its controlled space, where it has no control over users? The answer is simple: interpret what people are expressing directly on the Web.

Important advances have already been made in capturing, measuring, analyzing, and understanding what people are saying about a company’s products, services, etc. These advances build on the idea that social media content is enriched with “metadata” in the form of keywords. This metadata enables text to be categorized and shared across the Web in a process called “collaborative tagging,” which in turn has made it possible to measure people’s sentiments. The Structure of Collaborative Tagging Systems, by Scott A. Golder and Bernard A. Huberman, provides a very interesting and accurate classification of tags that helps clarify what sentiment analysis is based on (a small sketch after the list below illustrates the idea). According to Golder and Huberman, tags can be useful to

• identify a theme;
• identify a subject;
• identify content ownership;
• refine categories;
• identify qualities and characteristics;
• provide self-reference; and
• organize tasks.
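
To make the idea of tags as metadata more concrete, here is a minimal sketch in Python (with hypothetical user names, content IDs, and tags) of how collaboratively applied tags can be aggregated so that the most frequent ones act as emergent categories for a piece of content:

    # Collaborative tagging sketch: users attach free-form keyword "metadata"
    # (tags) to pieces of content; aggregating those tags groups the content
    # by theme. All names and data below are made up for illustration.
    from collections import Counter, defaultdict

    # Each entry: (user, content_id, tags that user applied)
    tag_events = [
        ("alice", "post-1", ["camera", "review", "toread"]),
        ("bob",   "post-1", ["camera", "photography"]),
        ("carol", "post-2", ["camera", "complaint"]),
        ("dave",  "post-2", ["customer-service", "complaint"]),
    ]

    tags_by_content = defaultdict(Counter)  # tag counts per item
    overall_tags = Counter()                # tag counts across the collection
    for user, content_id, tags in tag_events:
        tags_by_content[content_id].update(tags)
        overall_tags.update(tags)

    for content_id, counts in tags_by_content.items():
        print(content_id, counts.most_common(2))
    print("collection-wide themes:", overall_tags.most_common(3))

Note how the tags themselves reflect Golder and Huberman’s categories: “camera” identifies the subject, “complaint” a quality, and “toread” is purely self-referential.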

These conclusions support the idea that it is possible to measure and find regular patterns in Web content. The problem is by no means easy to address, and the approach is to use “sentiment analysis” or “text mining.” The basic concept behind sentiment analysis is to measure the polarity of an opinion (positive, negative, or neutral) regarding a subject, a product, a service, etc. This is also known as opinion mining or opinion extraction, which extracts, processes, and analyzes text in order to discover what people are expressing and thinking.
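
As a minimal illustration of the idea (not any particular vendor’s method), the Python sketch below classifies a short piece of text as positive, negative, or neutral using small hand-built word lists; real opinion-mining systems handle negation, context, and domain vocabulary, none of which is attempted here, and the word lists themselves are hypothetical:

    # Lexicon-based polarity sketch: count positive and negative words and
    # report which side outweighs the other. Word lists are illustrative only.
    POSITIVE = {"great", "love", "excellent", "fast", "recommend"}
    NEGATIVE = {"terrible", "hate", "slow", "broken", "disappointed"}

    def polarity(text: str) -> str:
        words = text.lower().split()
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        if score > 0:
            return "positive"
        if score < 0:
            return "negative"
        return "neutral"

    print(polarity("I love this camera and the autofocus is fast"))   # positive
    print(polarity("terrible battery life and I am so disappointed")) # negative
    print(polarity("the package arrived on Tuesday"))                 # neutral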

Currently, some companies are making significant efforts to perform sentiment analysis. Some of these organizations are not directly linked to the BI space, but they already have the tools needed to perform these tasks. It will be interesting to see what traditional BI companies are doing to address this emerging functionality.

In part two of this blog post, I will discuss the definition of sentiment analysis further, and I will analyze what some vendors are doing in the sentiment analysis space.

I welcome your thoughts—leave a comment below, and I’ll respond as soon as I can.
