Hound styles itself as Siri or Cortana with a PhD

2 Jun 2015

SoundHound, the free app with 260 million downloads that listens to music or humming and tells you what’s playing, debuts a new app called Hound on Tuesday in a private, Android-only beta. A full consumer launch comes this summer, and an iOS version will follow.

“What is the population and sizes of India and China and what are the capitals and what are the country codes for Italy and France?” SoundHound founder and CEO Keyvan Mohajer asks his phone, which a split second later reels off every answer correctly. He smiles. “We have been working on this for nine years.”

Hound combines voice recognition with natural language understanding in a way that lets users not only ask questions more naturally, but also ask follow-up queries without repeating the original question.

Mohajer wanted to build a way for people to talk to their computers and plunged into an electrical engineering PhD program at Stanford to study voice recognition and natural language processing. “I went to some investors and told them, ‘In 10 to 15 years we’re going to be talking to our computers, so I want to start a company to do that, and I want to raise money,’” Mohajer, 37, recalled to FORBES.

It’s called Hound, and it’s a voice-controlled interface SoundHound hopes you’ll soon see inside every phone, tablet, car, toaster, and espresso machine on the planet.

When I say “I ran into Jill and we talked about how Carly is surviving first grade,” your brain doesn’t imagine that Jill and I physically collided, nor that Carly’s life is actually in danger, and you probably assume that Carly is both a first grade student and related to Jill.

Mohajer and company vice president Katie McMahon recently previewed Hound for USA TODAY, highlighting a partnership they engineered with travel site Expedia, whose mission is to turn your smartphone into a travel agent. You can say “I need a hotel three Thursdays from now near Fisherman’s Wharf for less than $200 and it needs to have a pool and free WiFi,” and Hound will parse your data and find exactly what you’re looking for.

He and three others boarded themselves up in a dorm apartment in Escondido Village on Stanford campus for two weeks, surrounded by the heat and fan noise of twenty computers.

When Mohajer asks Hound for “pet-friendly hotels near the Golden Gate Bridge with three stars or more stars under $200 excluding bed and breakfasts,” within seconds the screen reveals a breakdown of hotels meeting those specific criteria. During a demo, SoundHound founder Keyvan Mohajer spoke a series of ever-escalating commands and questions concerning international locations into his Nexus 5, eventually forcing the Android app to spit out the populations and capitals for three different countries all at once—the app delivered what he was looking for, no processing time needed. “The only limit,” Mohajer says, “is my breath.” It’s part research tool, part personal assistant, part hands-free system. For example, ask one of the voice recognition platforms you currently use to “show restaurants that aren’t Chinese food” and the platform will latch onto keywords and list Chinese restaurants. The most impressive part, though, is its speed: Rather than collect, synthesize, send, and then process your input (which is how most speech-recognition systems work), Hound does all those things simultaneously.

SoundHound, which is similar to Shazam but can also recognize songs that are hummed or sung, makes money through advertising and referral fees to stores like iTunes. It’s like Google’s Instant Search with your voice, searching and re-searching as you talk, and it feels so much better than Siri’s endless churning. Now, nearly a decade after that pitch to investors, Mohajer’s original vision is here in the form of Hound, a voice search app that can handle incredibly complex questions and spit out answers with uncanny speed. Leave out context and add criteria, it doesn’t matter: I saw a demonstration of Hound handling the dizzying request “Give me a hotel room that’s more than $300 but less than $400, has WiFi, has air conditioning, picks me up from the airport, and don’t show me rooms that don’t have air conditioning” —it gave a list of accurate results within a few seconds.

“Houndify everything.” Hound’s accompanying developer platform, Houndify, is aimed at getting the mushrooming number of connected devices (25 billion by 2020, according to Gartner) to speak Hound. “Voice will never replace touch, but it will be coming to almost every product we own,” says Mohajer. “Imagine telling your coffee maker, ‘Make me a double latte with light foam,’ just like you would a real barista?”

Hound features dozens of category domains, including navigation, weather, stocks and geography. Mohajer says he’s particularly interested in recipes; he wants you to be able to say “Okay Hound, I have a stick of celery, some chicken broth, and a sausage,” and have a recipe returned.

It processed long strings of voice commands as soon as they were spoken, which meant it was able to turn immediately to answering the question, unlike other programs that first turn spoken words into written words and then parse the written words for meaning.

Mohajer, who has a PhD in electrical engineering and speech recognition from Stanford University, says he wanted to work on human-machine communication after grad school, but the notion was a sci-fi no-go for VCs.

Through the “Houndify” system, developers can integrate Hound technology and voice control into their own apps, and they can also plug their data and APIs into its interface. Competitors like Siri, Cortana and Google Now come from giants or from startups that were brought in-house and developed further under the wing of established tech companies.

Now that Apple, Google, Microsoft and others are deep into the search for speech recognition that borders on artificial intelligence, and Hollywood has even taken the topic for a spin in the futuristic movie Her, Mohajer feels more than vindicated. “It’s here, and it’s only going to get better. Now, my other prediction is that we’ll see a hotel on the moon.” He smiles. “Who knows, thanks to (SpaceX and Tesla founder) Elon Musk, maybe I’ll be right about that, too.”

But here, a robotic voice instantly replied, “The population of Washington, DC is 601,723.” There were two Washingtons there, and it got the right one.

In another test, he asked, “How many days are there between the day after tomorrow and three days before the second Thursday of November in 2022?” The app nailed it again. In our demo, which contained several dozen scripted questions but also some impromptu ones, the words coming out of Mohajer’s mouth popped up on screen nearly as fast as he was saying them, and Hound would pipe back with an answer faster than seemed possible.

SoundHound isn’t terribly worried about possible Hound competition from other startups. After all, SoundHound spent years developing its nuanced tech, so a startup with six months of work in language processing can’t hope to compete, says Mohajer.

Now that the product has speed, it’s clearly treasured: When Mohajer demoed Hound’s ability to do hotel search through a partnership with Expedia, he apologized in advance for the several-second delay that comes from pinging Expedia for search results. “This is the only part of the demo that’s not fast, and it’s not us,” he said. That said, our test also took place over Wi-Fi, and in a perfectly quiet room, making it impossible to tell whether Hound maintains these speeds in the real world.
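For a sense of what the date question from the demo involves, here is a rough sketch of the arithmetic in Python. It is not how Hound computes the answer. The helper names are hypothetical, and because the answer depends on the day the question is asked, "today" is a parameter the caller must supply.

```python
from datetime import date, timedelta

THURSDAY = 3  # Python's date.weekday(): Monday=0 ... Sunday=6


def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def days_in_question(today):
    """Days between the day after tomorrow and three days before
    the second Thursday of November 2022, relative to `today`."""
    start = today + timedelta(days=2)
    end = nth_weekday(2022, 11, THURSDAY, 2) - timedelta(days=3)
    return (end - start).days


if __name__ == "__main__":
    # Assumption: the question is asked on the day this script runs.
    print(days_in_question(date.today()))
```

The question chains three separate steps: a relative date, an ordinal weekday lookup, and a subtraction. That chaining is what makes it a tough test for a voice assistant.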

In the years since SoundHound beta launched in 2007, Mohajer has been pushing his team to shave down the processing time, sometimes even by 1%, in order to improve the user experience. Instead of relying on “entity detection,” where voice software looks for key words like “Chinese” and “restaurant” and tries to guess what the search is for (restaurants with Chinese food), Hound takes in the entire phrase. SoundHound hopes that getting developers to use Houndify and getting Hound’s voice recognition tech into cars or video games will help solidify Hound’s future.
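The gap between keyword-style entity detection and reading the whole phrase can be shown with a toy sketch in Python. This is an illustration only, not SoundHound's method: the cuisine list, the negation cues, and both function names are assumptions made up for the example.

```python
import re

# Hypothetical vocabulary, for illustration only.
CUISINES = {"chinese", "italian", "mexican", "thai"}
NEGATION = re.compile(
    r"\b(?:aren't|are not|isn't|is not|not|except|excluding|without)\b"
)


def normalize(query):
    """Lowercase the query and straighten curly apostrophes."""
    return query.lower().replace("’", "'")


def keyword_search(query):
    """Keyword matching: any cuisine word becomes a positive filter."""
    words = set(re.findall(r"[a-z']+", normalize(query)))
    return {"include": sorted(CUISINES & words), "exclude": []}


def phrase_search(query):
    """Toy whole-phrase reading: a cuisine that appears after a
    negation cue is excluded instead of included."""
    text = normalize(query)
    include, exclude = [], []
    for cuisine in sorted(CUISINES):
        pos = text.find(cuisine)
        if pos == -1:
            continue
        negated = NEGATION.search(text[:pos]) is not None
        (exclude if negated else include).append(cuisine)
    return {"include": include, "exclude": exclude}


if __name__ == "__main__":
    q = "show restaurants that aren’t Chinese food"
    print(keyword_search(q))
    print(phrase_search(q))
```

Given the query "show restaurants that aren’t Chinese food", the keyword version treats "Chinese" as something to include, while the phrase-aware version marks it as something to exclude. A real natural language system would need far more than a negation list, but the failure mode is the same.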

From the outset, Hound will have about 50 domains, or services it ties into through APIs: things like currency converters, news sites, flight status information, navigation, local search, weather, and stocks. SoundHound hopes that opening its platform to developers will quickly add more.

Mohajer says the plan is to ramp that up into the millions. “Siri launched with 10 domains, and three years later it’s at about 22 new domains, so it takes a long time,” he says. In the future, Mohajer imagines, voice control won’t replace touch or typing but will be another appealing option, especially in the growing internet of things sector, where more devices will be controllable but might not have screen-based interfaces.

Sometimes that’s just fine, but in similar tools like Siri and Cortana, web results are a sign the system couldn’t keep up with what you’re asking of it. Mohajer contends that by kicking people to web results, nobody ends up feeling disappointed, though I’d argue that if it happens enough you’ll just stop using the app entirely and forget about it.

It’s worth noting that Hound is arriving at a time when Google and Apple are stepping up efforts to add context to the things people are looking at on their phones, often using voice interfaces, which could almost entirely remove the usefulness of Hound for simple searches. Last week, Google unveiled Now on Tap as part of its upcoming Android M release, a feature that brings its Now service inside of every app and gives the company an incredible amount of context for why you’re looking for something. Apple is also rumored to be working on a feature called Proactive that attempts to put relevant apps and information in front of users without them having to search for it in the first place. That hurdle of having to find and launch Hound could change if app developers build the voice search into their apps, or if SoundHound and its technology get snapped up by one of these larger players.

“I use Google Maps on iOS instead of Apple Maps, even though Apple Maps is more integrated,” he says. “I think if you deliver something that is substantially better, people will use it.”
