These guys have found a way to identify a seed group that, when infected, can spread a message across an entire network. And they say it can be done quickly and easily, even on relatively large networks.
Their method is relatively straightforward. It is based on the idea that an individual will eventually receive a message if a certain proportion of his or her friends already have that message. This proportion is a critical threshold and is crucial in their approach.
Having determined the threshold, these guys examine the network and look for all those individuals who have more friends than this critical number. They then remove those who exceed the threshold by the largest amount.
In the next, step, they repeat this process, looking for all those with more friends than the critical threshold and pruning those with the greatest excess. And so on.
This process finishes when there is nobody left in the network who has more friends than the threshold. When this happens, whoever is left is the seed group. A message sent to each member of this group can and should spread to the entire network.
That’s a slick approach to a well-known problem. What’s got network scientists bogged down in the past is that they’ve always couched this conundrum in terms of finding the smallest seed group. Proving that the group you’ve found is the smallest really is a tricky problem.
But Shakarian and co make no such claim. “We present a method guaranteed to find a set of nodes that causes the entire population to activate – but is not necessarily of minimal size,” they say.
That suddenly makes the problem much easier. Indeed, these guys have tested it on a large number of networks to see how well it works. Their test networks include Flickr, FourSquare, Frienster, Last.FM, Digg (from Dec 2010), Yelp, YouTube and so on.
And the algorithm works well. “On a Friendster social network consisting of 5.6 million nodes and 28 million edges we found a seed set in under 3.6 hours,” say Shakarian and co. For this they used an Intel X5677 Xeon processor operating at 3.46 GHz with a 12 MB Cache running Red Hat Enterprise Linux version 6.1 and equipped with 70 GB of physical memory.