Microsoft Powerset

July 3, 2008

I wonder whether the tools Microsoft gained through the Powerset acquisition will be used not only in its consumer Live Search environment but also to create new ways of looking at data in its enterprise search tools.

It seems to me that using Powerset to semantically slice, dice and present additional faceted data on a corporate information stack could be very useful indeed.

Some form of integration with SharePoint maybe…

Today an interesting startup came out of stealth (no, not mine) and launched a very interesting-sounding product.

The startup is Pluribo, and their product is designed to help users rapidly get to the gist of product reviews by aggregating and summarizing the available review data and then generating one or more short descriptive sentences.

This is the kind of thing I think of (and have mentioned) when I speak of directed Text Analytics, Semantic Learning or Search. Text analytics is core to their product, but they are using it not as a product in and of itself but as a means to an end. We will see more and more companies (mine is one) producing very specific vertical products based around these larger horizontal technology buckets. It is unimportant how they do their summarization and what algorithms are chosen to do it… All that matters is the outcome.

This is another great example of utilizing large quantities of data with a bit of semantics sprinkled in to provide the end user with an answer.

Pluribo says they have some patents pending on this stuff… The summarization space is quite well researched and there is a TON of prior art out there, so I hope they are not pinning success on a patent. That said, I am totally IMPRESSED with the sentence generation system; this is a very hard thing to do right, and from the brief look I have taken they are doing a great job. It appears that they not only pull summary-type data out of the aggregated reviews but also figure out the sentiment that is most relevant and, with this in mind, generate the summary sentence with appropriate words.

Pluribo has created a great product, one I thought of building once upon a time before I went the product route I am on now. It’s also one I will use a lot, and I bet others will too.

I hope they stay focused and refine the product even more, as I think it’s a winner, not because it’s a directed application of text analytics but because it’s darn useful.

Cheers Guys! 🙂

First, what do I mean by actionable information?

It is information that comes from a trusted source, is about something that’s important to you, and that, once known, compels you to take action.

So how do we acquire actionable information from the fire hose of data we see every day? It often appears that the volume of information flowing through our various data pipes is inherently at odds with our ability to make it actionable. I think this misses one key point: the sheer volume of similar data can actually be a positive if the right “mining” tools are in place.

This flow of information more often than not has patterns that can be extracted statistically, syntactically or semantically, or with some combination of the three, e.g. using syntactic and semantic features to improve statistical recognition.
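To make the statistical side of this concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not anyone's production method: the function names, the tiny stopword list and the sample messages are all made up for the example. It counts how many messages in a stream contain each two-word phrase and keeps the phrases that recur in a large enough share of them, using the stopword filter as a very crude lexical feature to cut statistical noise.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used as a crude lexical filter.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "on", "for", "it"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def recurring_phrases(messages, n=2, min_share=0.5):
    """Return (phrase, message_count) pairs for n-grams that appear in at
    least `min_share` of the messages, most frequent first."""
    doc_freq = Counter()
    for message in messages:
        tokens = tokenize(message)
        # Use a set so each phrase is counted at most once per message.
        ngrams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        # Drop any n-gram that contains a stopword.
        ngrams = {g for g in ngrams if not any(t in STOPWORDS for t in g)}
        doc_freq.update(ngrams)

    threshold = min_share * len(messages)
    hits = [(" ".join(g), c) for g, c in doc_freq.items() if c >= threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = [
        "Battery life is great on this phone",
        "Great camera but battery life is short",
        "The battery life could be better",
        "Screen is sharp",
    ]
    print(recurring_phrases(sample))
```

The more similar messages that flow past, the more reliable these counts become, which is the sense in which volume helps rather than hurts. A real system would swap the stopword list for proper syntactic or semantic features.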

A good overview of leveraging data using statistics can be found in a talk given by Peter Norvig at Paul Graham’s latest startup school.