machine learning

This bipedal robot has a flying head

Posted by | Gadgets, machine learning, mobile robot, robot, robotics, TC, university of tokyo | No Comments

Making a bipedal robot is hard. You have to maintain exquisite balance at all times and, even with the amazing things Atlas can do, there is still a chance that your crazy robot will fall over and bop its electronic head. But what if that head is a quadcopter?

Researchers at the University of Tokyo have done just that with their wild Aerial-Biped. The robot isn’t completely bipedal; instead, it’s designed to act like a bipedal robot without the tricky issue of being truly bipedal. Think of these legs as more of a fun bit of puppetry that mimics walking but doesn’t really walk.

“The goal is to develop a robot that has the ability to display the appearance of bipedal walking with dynamic mobility, and to provide a new visual experience. The robot enables walking motion with very slender legs like those of a flamingo without impairing dynamic mobility. This approach enables casual users to choreograph biped robot walking without expertise. In addition, it is much cheaper compared to a conventional bipedal walking robot,” the team told IEEE.

The robot is similar to the bizarre-looking Ballu, a blimp robot with a floating head and spindly legs. The new robot learned how to walk convincingly through machine learning, a feat that gives it a realistic gait even though it is really an aerial system. It’s definitely a clever little project and could be interesting at a theme park or in an environment where a massive bipedal robot falling over on someone might be discouraged.

Powered by WPeMatico

Autonomous drones could herd birds away from airports

Posted by | artificial intelligence, drones, Gadgets, machine learning, TC | No Comments

Bird strikes on aircraft may be rare, but not so rare that airports shouldn’t take precautions against them. But keeping birds away is a difficult proposition: How do you control the behavior of flocks of dozens or hundreds of birds? Perhaps with a drone that autonomously picks the best path to do so, like this one developed by CalTech researchers.

Right now airports may use manually piloted drones, which are expensive and of course limited by the number of qualified pilots, or trained falcons — which as you might guess is a similarly difficult method to scale.

Soon-Jo Chung at CalTech became interested in the field after seeing the near-disaster in 2009 when US Airways 1549 nearly crashed due to a bird strike but was guided to a comparatively safe landing in the Hudson.

“It made me think that next time might not have such a happy ending,” he said in a CalTech news release. “So I started looking into ways to protect airspace from birds by leveraging my research areas in autonomy and robotics.”

A drone seems like an obvious solution — put it in the air and send those geese packing. But predicting and reliably influencing the behavior of a flock is no simple matter.

“You have to be very careful in how you position your drone. If it’s too far away, it won’t move the flock. And if it gets too close, you risk scattering the flock and making it completely uncontrollable,” Chung said.

The team studied models of how groups of animals move and affect one another and arrived at their own model describing how birds move in response to threats. From this model, they can derive the flight path a drone should follow to make the birds swing aside in the desired direction without panicking and scattering.
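The trade-off Chung describes can be illustrated with a toy calculation. The sketch below is not the Caltech team's model: it assumes a hypothetical standoff band (`min_dist`, `max_dist`, in metres) and simply places the drone on the side of the flock opposite the direction the birds should move. All names and numbers are invented for illustration.

```python
import math

def clamp_standoff(requested, min_dist, max_dist):
    """Keep the drone inside an assumed distance band: closer than min_dist
    risks scattering the flock, farther than max_dist has no effect."""
    return max(min_dist, min(requested, max_dist))

def drone_waypoint(flock_center, target_heading_deg, standoff):
    """Place the drone `standoff` metres from the flock centre, on the side
    opposite the direction we want the flock to move."""
    heading = math.radians(target_heading_deg)
    # Unit vector of the desired flock motion (0 deg = +x, counterclockwise).
    ux, uy = math.cos(heading), math.sin(heading)
    cx, cy = flock_center
    return (cx - standoff * ux, cy - standoff * uy)

# A requested 5 m approach is pushed back out to the 20 m minimum.
standoff = clamp_standoff(5.0, min_dist=20.0, max_dist=60.0)
waypoint = drone_waypoint((100.0, 50.0), 0.0, standoff)  # (80.0, 50.0)
```

A real controller would update the waypoint continuously as the flock reacts; this static version only shows why both a lower and an upper bound on distance matter.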

Armed with this new software, drones were deployed in several spaces with instructions to deter birds from entering a given protected area, and it seems to have worked.

More experimentation is necessary, of course, to tune the model and get the system to a state that is reliable and works with various sizes of flocks, bird airspeeds, and so on. But it’s not hard to imagine this as a standard system for locking down airspace: a dozen or so drones informed by precision radar could protect quite a large area.

The team’s results are published in IEEE Transactions on Robotics.

Your next summer DIY project is an AI-powered doodle camera

Posted by | artificial intelligence, Gadgets, machine learning, Polaroid, TC | No Comments

With long summer evenings comes the perfect opportunity to dust off your old boxes of circuits and wires and start to build something. If you’re short on inspiration, you might be interested in artist and engineer Dan Macnish’s how-to guide on building an AI-powered doodle camera using a thermal printer, Raspberry Pi, a dash of Python and Google’s Quick Draw data set.

“Playing with neural networks for object recognition one day, I wondered if I could take the concept of a Polaroid one step further, and ask the camera to re-interpret the image, printing out a cartoon instead of a faithful photograph.” Macnish wrote on his blog about the project, called Draw This.

To make this work, Macnish drew on Google’s object recognition neural network and the data set created for the game Google Quick, Draw! Tying the two systems together with some Python code, Macnish was able to have his creation recognize real images and print out the best corresponding doodle in the Quick, Draw! data set.

But since output doodles are limited to the data set, there can be some discrepancy between what the camera “sees” and what it generates for the photo.

“You point and shoot – and out pops a cartoon; the camera’s best interpretation of what it saw,” Macnish writes. “The result is always a surprise. A food selfie of a healthy salad might turn into an enormous hot dog.”
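The salad-to-hot-dog surprise follows naturally from matching detections against a limited set of drawings. The snippet below is a minimal sketch of that matching step, not Macnish's code: the detection format, the `min_score` threshold and the tiny category set are all assumptions made for illustration.

```python
def pick_doodle(detections, doodle_categories, min_score=0.4):
    """Return the doodle category for the highest-scoring detection that
    has a matching category, or None if nothing matches.

    detections: list of (label, score) pairs from an object detector.
    doodle_categories: set of category names that have drawings available.
    """
    best = None
    for label, score in detections:
        if score < min_score or label not in doodle_categories:
            continue  # skip weak detections and labels with no doodle
        if best is None or score > best[1]:
            best = (label, score)
    return best[0] if best else None

# The detector is most confident about "salad", but there is no salad doodle
# in this stand-in category set, so the weaker "hot dog" detection wins.
detections = [("salad", 0.9), ("hot dog", 0.5)]
print(pick_doodle(detections, {"hot dog", "cat"}))  # hot dog
```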

If you want to give this a go for yourself, Macnish has uploaded the instructions and code needed to build this project on GitHub.

Original Stitch’s new Bodygram will measure your body

Posted by | Android, Apps, bra size, ceo, Clothing, machine learning, Mobile, online shopping, original stitch, sewing, shapescale, shirt, Startups, TC | No Comments

After years of teasing, Original Stitch has officially launched their Bodygram service and will be rolling it out this summer. The system can scan your body based on front and side photos and will create custom shirts with your precise measurements.

“Bodygram gives you full body measurements as accurate as taken by professional tailors from just two photos on your phone. Simply take a front photo and a side photo and upload to our cloud and you will receive a push notification within minutes when your Bodygram sizing report is ready,” said CEO Jin Koh. “In the sizing report you will find your full body measurements including neck, sleeve, shoulder, chest, waist, hip, etc. Bodygram is capable of producing sizing result within 99 percent accuracy compared to professional human tailors.”

The technology is a clever solution to the biggest problem in custom clothing: fit. While it’s great to find a service that will tailor your clothing based on your measurements, often these measurements are slightly off and can affect the cut of the shirt or pants. Right now, Koh said, his team offers free returns if the custom shirts don’t fit.

Further, the technology is brand new and avoids many of the pitfalls of the original body-scanning tech. For example, Bodygram doesn’t require you to get into a Spandex onesie like most systems do and it can capture 40 measurements with only two full-body photos.

“Bodygram is the first sizing technology that works on your phone capable of giving you highly accurate sizing result from just two photos with you wearing normal clothing on any background,” said Koh. “Legacy technologies on the market today require you to wear a very tight-fitting spandex suit, take 360 photos of you and require a plain background to work. Other technologies give you accuracy with five inches deviation in accuracy while Bodygram is the first technology to give you sub-one-inch accuracy. We are the first to use both computer vision and machine learning techniques to solve the problem of predicting your body shape underneath the clothes. Once we predicted your body shape we wrote our proprietary algorithm to calculate the circumferences and the length for each part of the body.”
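Koh does not describe the proprietary algorithm. Purely as an illustration of how a circumference could be estimated from a front-view width and a side-view depth, the sketch below treats a body cross-section as an ellipse and applies Ramanujan's perimeter approximation. The elliptical assumption and the function name are ours, not Bodygram's.

```python
import math

def ellipse_circumference(width, depth):
    """Approximate the perimeter of an ellipse whose full width and depth
    are given in the same units, using Ramanujan's first approximation."""
    a, b = width / 2.0, depth / 2.0  # semi-axes
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
```

For a circular cross-section (width equal to depth) the formula reduces to the exact circle circumference, which makes a handy sanity check.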

Koh hopes the technology will reduce returns.

“It’s not uncommon to see clothing return rates reaching in the 40-50 percent range,” he said. “Apparel clothing sales is among the lowest penetration in online shopping.”

The system also can be used to measure your body over time in order to collect health and weight data as well as help other manufacturers produce products that fit you perfectly. The app will launch this summer on Android and iOS. The company will be licensing the technology to other providers that will be able to create custom fits based on just a few side and front photos. Sales at the company grew 175 percent this year and they now have 350,000 buyers that are already creating custom shirts.

A number of competitors are in this interesting space, most notably ShapeScale, a company that appeared at TechCrunch Disrupt and promised a full body scan using a robotic scale. This, however, is the first commercial use of standard photos to measure your appendages and thorax and it’s an impressive step forward in the world of custom clothing.

Apple is rebuilding Maps from the ground up

Posted by | Apple, apple inc, Apple Maps, artificial intelligence, california, ceo, Chevron, computing, driver, eddy cue, Google, gps, iPad, iPhone, Japan, journalist, location services, mac pro, machine learning, Mobile, mobile devices, OpenStreetMap, San Francisco, satellite imagery, smartphones, Software, TC, technology, TomTom, United States, vp | No Comments

I’m not sure if you’re aware, but the launch of Apple Maps went poorly. After a rough first impression, an apology from the CEO, several years of patching holes with data partnerships and some glimmers of light with long-awaited transit directions and improvements in business, parking and place data, Apple Maps is still not where it needs to be to be considered a world-class service.

Maps needs fixing.

Apple, it turns out, is aware of this, so it’s re-building the maps part of Maps.

It’s doing this by using first-party data gathered by iPhones with a privacy-first methodology and its own fleet of cars packed with sensors and cameras. The new product will launch in San Francisco and the Bay Area with the next iOS 12 beta and will cover Northern California by fall.

Every version of iOS will get the updated maps eventually, and they will be more responsive to changes in roadways and construction, more visually rich depending on the specific context they’re viewed in and feature more detailed ground cover, foliage, pools, pedestrian pathways and more.

This is nothing less than a full re-set of Maps and it’s been four years in the making, which is when Apple began to develop its new data-gathering systems. Eventually, Apple will no longer rely on third-party data to provide the basis for its maps, which has been one of its major pitfalls from the beginning.

“Since we introduced this six years ago — we won’t rehash all the issues we’ve had when we introduced it — we’ve done a huge investment in getting the map up to par,” says Apple SVP Eddy Cue, who now owns Maps, in an interview last week. “When we launched, a lot of it was all about directions and getting to a certain place. Finding the place and getting directions to that place. We’ve done a huge investment of making millions of changes, adding millions of locations, updating the map and changing the map more frequently. All of those things over the past six years.”

But, Cue says, Apple has room to improve on the quality of Maps, something that most users would agree on, even with recent advancements.

“We wanted to take this to the next level,” says Cue. “We have been working on trying to create what we hope is going to be the best map app in the world, taking it to the next step. That is building all of our own map data from the ground up.”

In addition to Cue, I spoke to Apple VP Patrice Gautier and more than a dozen Apple Maps team members at its mapping headquarters in California this week about its efforts to re-build Maps, and to do it in a way that aligned with Apple’s very public stance on user privacy.

If, like me, you’re wondering whether Apple thought of building its own maps from scratch before it launched Maps, the answer is yes. At the time, there was a choice to be made about whether or not it wanted to be in the business of maps at all. Given that the future of mobile devices was becoming very clear, it knew that mapping would be at the core of nearly every aspect of its devices, from photos to directions to location services provided to apps. Decision made, Apple plowed ahead, building a product that relied on a patchwork of data from partners like TomTom, OpenStreetMap and other geo data brokers. The result was underwhelming.

Almost immediately after Apple launched Maps, it realized that it was going to need help and it signed on a bunch of additional data providers to fill the gaps in location, base map, point-of-interest and business data.

It wasn’t enough.

“We decided to do this just over four years ago. We said, ‘Where do we want to take Maps? What are the things that we want to do in Maps?’ We realized that, given what we wanted to do and where we wanted to take it, we needed to do this ourselves,” says Cue.

Because Maps are so core to so many functions, success wasn’t tied to just one function. Maps needed to be great at transit, driving and walking — but also as a utility used by apps for location services and other functions.

Cue says that Apple needed to own all of the data that goes into making a map, and to control it from a quality as well as a privacy perspective.

There’s also the matter of corrections, updates and changes entering a long loop of submission to validation to update when you’re dealing with external partners. The Maps team would have to be able to correct roads, pathways and other updating features in days or less, not months. Not to mention the potential competitive advantages it could gain from building and updating traffic data from hundreds of millions of iPhones, rather than relying on partner data.

Cue points to the proliferation of devices running iOS, now over a billion, as a deciding factor to shift its process.

“We felt like because the shift to devices had happened — building a map today in the way that we were traditionally doing it, the way that it was being done — we could improve things significantly, and improve them in different ways,” he says. “One is more accuracy. Two is being able to update the map faster based on the data and the things that we’re seeing, as opposed to driving again or getting the information where the customer’s proactively telling us. What if we could actually see it before all of those things?”

I query him on the rapidity of Maps updates, and whether this new map philosophy means faster changes for users.

“The truth is that Maps needs to be [updated more], and even are today,” says Cue. “We’ll be doing this even more with our new maps, [with] the ability to change the map in real time and often. We do that every day today. This is expanding us to allow us to do it across everything in the map. Today, there’s certain things that take longer to change.

“For example, a road network is something that takes a much longer time to change currently. In the new map infrastructure, we can change that relatively quickly. If a new road opens up, immediately we can see that and make that change very, very quickly around it. It’s much, much more rapid to do changes in the new map environment.”

So a new effort was created to begin generating its own base maps, the very lowest building block of any really good mapping system. After that, Apple would begin layering on living location data, high-resolution satellite imagery and brand new intensely high-resolution image data gathered from its ground cars until it had what it felt was a “best in class” mapping product.

There is only really one big company on earth that owns an entire map stack from the ground up: Google.

Apple knew it needed to be the other one. Enter the vans.

Apple vans spotted

Though the overall project started earlier, the first glimpse most folks had of Apple’s renewed efforts to build the best Maps product was the vans that started appearing on the roads in 2015 with “Apple Maps” signs on the side. Capped with sensors and cameras, these vans popped up in various cities and sparked rampant discussion and speculation.

The new Apple Maps will be the first time the data collected by these vans is actually used to construct and inform its maps. This is their coming out party.

Some people have commented that Apple’s rigs look more robust than the simple GPS + Camera arrangements on other mapping vehicles — going so far as to say they look more along the lines of something that could be used in autonomous vehicle training.

Apple isn’t commenting on autonomous vehicles, but there’s a reason the arrays look more advanced: they are.

Earlier this week I took a ride in one of the vans as it ran a sample route to gather the kind of data that would go into building the new maps. Here’s what’s inside.

In addition to a beefed-up GPS rig on the roof, four LiDAR arrays mounted at the corners and eight cameras shooting overlapping high-resolution images, there’s also the standard physical measuring tool attached to a rear wheel that allows for precise tracking of distance and image capture. In the rear there is a surprising lack of bulky equipment. Instead, it’s a straightforward Mac Pro bolted to the floor, attached to an array of solid state drives for storage. A single USB cable routes up to the dashboard where the actual mapping-capture software runs on an iPad.

While mapping, a driver…drives, while an operator takes care of the route, ensuring that a coverage area that has been assigned is fully driven, as well as monitoring image capture. Each drive captures thousands of images as well as a full point cloud (a 3D map of space defined by dots that represent surfaces) and GPS data. I later got to view the raw data presented in 3D and it absolutely looks like the quality of data you would need to begin training autonomous vehicles.

More on why Apple needs this level of data detail later.

When the images and data are captured, they are then encrypted on the fly and recorded on to the SSDs. Once full, the SSDs are pulled out, replaced and packed into a case, which is delivered to Apple’s data center, where a suite of software eliminates from the images private information like faces, license plates and other info. From the moment of capture to the moment they’re sanitized, they are encrypted with one key in the van and the other key in the data center. Technicians and software that are part of its mapping efforts down the pipeline from there never see unsanitized data.

This is just one element of Apple’s focus on the privacy of the data it is utilizing in New Maps.

Probe data and privacy

In every conversation I have with any member of the team throughout the day, privacy is brought up and emphasized. This is obviously by design, as Apple wants to impress upon me as a journalist that it’s taking this very seriously indeed, but it doesn’t change the fact that it’s evidently built in from the ground up and I could not find a false note in any of the technical claims or the conversations I had.

Indeed, from the data security folks to the people whose job it is to actually make the maps work well, the constant refrain is that Apple does not feel that it is being held back in any way by not hoovering every piece of customer-rich data it can, storing and parsing it.

The consistent message is that the team feels it can deliver a high-quality navigation, location and mapping product without the directly personal data used by other platforms.

“We specifically don’t collect data, even from point A to point B,” notes Cue. “We collect data — when we do it — in an anonymous fashion, in subsections of the whole, so we couldn’t even say that there is a person that went from point A to point B. We’re collecting the segments of it. As you can imagine, that’s always been a key part of doing this. Honestly, we don’t think it buys us anything [to collect more]. We’re not losing any features or capabilities by doing this.”

The segments that he is referring to are sliced out of any given person’s navigation session. Neither the beginning nor the end of any trip is ever transmitted to Apple. Rotating identifiers, not personal information, are assigned to any data or requests sent to Apple, and Apple augments the “ground truth” data provided by its own mapping vehicles with this “probe data” sent back from iPhones.

Because only random segments of any person’s drive are ever sent and that data is completely anonymized, there is never a way to tell whether any trip was made by a single individual. The local system signs the IDs and only it knows to whom that ID refers. Apple is working very hard here to not know anything about its users. This kind of privacy can’t be added on at the end; it has to be woven in at the ground level.
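The scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not Apple's implementation: the trim length, the segment length and the use of random UUIDs as rotating identifiers are all hypothetical, and for simplicity the sketch keeps every middle segment rather than a random subset.

```python
import uuid

def anonymized_segments(trip_points, trim=5, segment_len=10):
    """Drop the start and end of a trip, cut the rest into segments, and
    tag each segment with a fresh random identifier.

    trip_points: ordered list of (lat, lon) samples from one trip.
    """
    if len(trip_points) <= 2 * trim:
        return []  # too short to share anything without exposing endpoints
    middle = trip_points[trim:len(trip_points) - trim]
    segments = []
    for start in range(0, len(middle), segment_len):
        segments.append({
            # A new identifier per segment, so IDs alone cannot link segments.
            "id": uuid.uuid4().hex,
            "points": middle[start:start + segment_len],
        })
    return segments
```

Because each segment carries its own identifier and the endpoints never leave the device, a server receiving these records cannot rejoin them into a trip by ID or read off an origin or destination.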

Because Apple’s business model does not rely on it serving to you, say, an ad for a Chevron on your route, it doesn’t need to even tie advertising identifiers to users.

Any personalization or Siri requests are all handled on-board by the iOS device’s processor. So if you get a drive notification that tells you it’s time to leave for your commute, that’s learned, remembered and delivered locally, not from Apple’s servers.

That’s not new, but it’s important to note given the new thing to take away here: Apple is flipping on the power of having millions of iPhones passively and actively improving their mapping data in real time.

In short: Traffic, real-time road conditions, road systems, new construction and changes in pedestrian walkways are about to get a lot better in Apple Maps.

The secret sauce here is what Apple calls probe data: essentially little slices of vector data that represent direction and speed, transmitted back to Apple completely anonymized with no way to tie them to a specific user or even any given trip. Apple is reaching in and sipping a tiny amount of data from millions of users instead, giving it a holistic, real-time picture without compromising user privacy.
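To make "direction and speed" concrete, the sketch below derives both from two timestamped GPS fixes using the standard haversine distance and initial-bearing formulas. The input format is an assumption; the article does not say how Apple computes or encodes probe data.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def speed_and_bearing(fix_a, fix_b):
    """Return (speed in m/s, bearing in degrees clockwise from north)
    between two fixes given as (lat_deg, lon_deg, unix_seconds).
    Assumes fix_b was recorded strictly after fix_a."""
    lat1, lon1, t1 = fix_a
    lat2, lon2, t2 = fix_b
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat, dlon = p2 - p1, math.radians(lon2 - lon1)
    # Haversine great-circle distance.
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))
    # Initial bearing from the first fix toward the second.
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance / (t2 - t1), bearing
```

A pair like this, detached from any account and from the rest of the trip, is the kind of small vector the article describes.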

If you’re driving, walking or cycling, your iPhone can already tell this. Now if it knows you’re driving, it also can send relevant traffic and routing data in these anonymous slivers to improve the entire service. This only happens if your Maps app has been active, say you check the map, look for directions, etc. If you’re actively using your GPS for walking or driving, then the updates are more precise and can help with walking improvements like charting new pedestrian paths through parks — building out the map’s overall quality.

All of this, of course, is governed by whether you opted into location services, and can be toggled off using the maps location toggle in the Privacy section of settings.

Apple says that this will have a near zero effect on battery life or data usage, because you’re already using the ‘maps’ features when any probe data is shared and it’s a fraction of what power is being drawn by those activities.

From the point cloud on up

But maps cannot live on ground truth and mobile data alone. Apple is also gathering new high-resolution satellite data to combine with its ground truth data for a solid base map. It’s then layering satellite imagery on top of that to better determine foliage, pathways, sports facilities and building shapes.

After the downstream data has been cleaned up of license plates and faces, it gets run through a bunch of computer vision programming to pull out addresses, street signs and other points of interest. These are cross referenced to publicly available data like addresses held by the city and new construction of neighborhoods or roadways that comes from city planning departments.

But one of the special sauce bits that Apple is adding to the mix of mapping tools is a full-on point cloud that maps in 3D the world around the mapping van. This gives the team all kinds of opportunities to better understand which items are street signs (retro-reflective rectangular object about 15 feet off the ground? Probably a street sign) or stop signs or speed limit signs.

It seems like it also could enable positioning of navigation arrows in 3D space for AR navigation, but Apple declined to comment on “any future plans” for such things.

Apple also uses semantic segmentation and Deep Lambertian Networks to analyze the point cloud coupled with the image data captured by the car and from high-resolution satellites in sync. This allows 3D identification of objects, signs, lanes of traffic and buildings and separation into categories that can be highlighted for easy discovery.

The coupling of high-resolution image data from car and satellite, plus a 3D point cloud, results in Apple now being able to produce full orthogonal reconstructions of city streets with textures in place. This is massively higher-resolution and easier to see, visually. And it’s synchronized with the “panoramic” images from the car, the satellite view and the raw data. These techniques are used in self-driving applications because they provide a really holistic view of what’s going on around the car. But the ortho view can do even more for human viewers of the data by allowing them to “see” through brush or tree cover that would normally obscure roads, buildings and addresses.

This is hugely important when it comes to the next step in Apple’s battle for supremely accurate and useful Maps: human editors.

Apple has had a team of tool builders working specifically on a toolkit that can be used by human editors to vet and parse data, street by street. The editor’s suite includes tools that allow human editors to assign specific geometries to flyover buildings (think Salesforce tower’s unique ridged dome) that allow them to be instantly recognizable. It lets editors look at real images of street signs shot by the car right next to 3D reconstructions of the scene and computer vision detection of the same signs, instantly recognizing them as accurate or not.

Another tool corrects addresses, letting an editor quickly move an address to the center of a building, determine whether they’re misplaced and shift them around. It also allows for access points to be set, making Apple Maps smarter about the “last 50 feet” of your journey. You’ve made it to the building, but what street is the entrance actually on? And how do you get into the driveway? With a couple of clicks, an editor can make that permanently visible.

“When we take you to a business and that business exists, we think the precision of where we’re taking you to, from being in the right building,” says Cue. “When you look at places like San Francisco or big cities from that standpoint, you have addresses where the address name is a certain street, but really, the entrance in the building is on another street. They’ve done that because they want the better street name. Those are the kinds of things that our new Maps really is going to shine on. We’re going to make sure that we’re taking you to exactly the right place, not a place that might be really close by.”

Water, swimming pools (new to Maps entirely), sporting areas and vegetation are now more prominent and fleshed out thanks to new computer vision and satellite imagery applications. So Apple had to build editing tools for those, as well.

Many hundreds of editors will be using these tools, in addition to the thousands of employees Apple already has working on maps, but the tools had to be built first, now that Apple is no longer relying on third parties to vet and correct issues.

And the team also had to build computer vision and machine learning tools that allow it to determine whether there are issues to be found at all.

Anonymous probe data from iPhones, visualized, looks like thousands of dots, ebbing and flowing across a web of streets and walkways, like a luminescent web of color. At first, chaos. Then, patterns emerge. A street opens for business, and nearby vessels pump orange blood into the new artery. A flag is triggered and an editor looks to see if a new road needs a name assigned.

A new intersection is added to the web and an editor is flagged to make sure that the left turn lanes connect correctly across the overlapping layers of directional traffic. This has the added benefit of massively improved lane guidance in the new Apple Maps.

Apple is counting on this combination of human and AI flagging to allow editors to first craft base maps and then also maintain them as the ever-changing biomass wreaks havoc on roadways, addresses and the occasional park.

Here there be Helvetica

Apple’s new Maps, like many other digital maps, display vastly differently depending on scale. If you’re zoomed out, you get less detail. If you zoom in, you get more. But Apple has a team of cartographers on staff that work on more cultural, regional and artistic levels to ensure that its Maps are readable, recognizable and useful.

These teams have goals that are at once concrete and a bit out there — in the best traditions of Apple pursuits that intersect the technical with the artistic.

The maps need to be usable, but they also need to fulfill cognitive goals on cultural levels that go beyond what any given user might know they need. For instance, in the U.S., it is very common to have maps that have a relatively low level of detail even at a medium zoom. In Japan, however, the maps are absolutely packed with details at the same zoom, because that increased information density is what is expected by users.

This is the department of details. They’ve reconstructed replicas of hundreds of actual road signs to make sure that the shield on your navigation screen matches the one you’re seeing on the highway road sign. When it comes to public transport, Apple licensed all of the type faces that you see on your favorite subway systems, like Helvetica for NYC. And the line numbers are in the exact same order that you’re going to see them on the platform signs.

It’s all about reducing the cognitive load that it takes to translate the physical world you have to navigate into the digital world represented by Maps.

Bottom line

The new version of Apple Maps will be in preview next week with just the Bay Area of California going live. It will be stitched seamlessly into the “current” version of Maps, but the difference in quality level should be immediately visible based on what I’ve seen so far.

Better road networks, more pedestrian information, sports areas like baseball diamonds and basketball courts, more land cover, including grass and trees, represented on the map, as well as buildings, building shapes and sizes that are more accurate. A map that feels more like the real world you’re actually traveling through.

Search is also being revamped to make sure that you get more relevant results (on the correct continents) than ever before. Navigation, especially pedestrian guidance, also gets a big boost. Parking areas and building details to get you the last few feet to your destination are included, as well.

What you won’t see, for now, is a full visual redesign.

“You’re not going to see huge design changes on the maps,” says Cue. “We don’t want to combine those two things at the same time because it would cause a lot of confusion.”

Apple Maps is getting the long-awaited attention it really deserves. By taking ownership of the project fully, Apple is committing itself to actually creating the map that users expected of it from the beginning. It’s been a lingering shadow on iPhones, especially, where alternatives like Google Maps have offered more robust feature sets that are so easy to compare against the native app but impossible to access at the deep system level.

The argument has been made ad nauseam, but it’s worth saying again that if Apple thinks that mapping is important enough to own, it should own it. And that’s what it’s trying to do now.

“We don’t think there’s anybody doing this level of work that we’re doing,” adds Cue. “We haven’t announced this. We haven’t told anybody about this. It’s one of those things that we’ve been able to keep pretty much a secret. Nobody really knows about it. We’re excited to get it out there. Over the next year, we’ll be rolling it out, section by section in the U.S.”

8 big announcements from Google I/O 2018

Posted by | Android, Apps, artificial intelligence, Developer, Facebook, Google, Google Assistant, Google I/O 2018, machine learning, ML, natural language processing, Sundar Pichai, TC, YouTube | No Comments

Google kicked off its annual I/O developer conference at Shoreline Amphitheater in Mountain View, California. Here are some of the biggest announcements from the Day 1 keynote. There will be more to come over the next couple of days, so follow along on everything Google I/O on TechCrunch. 

Google goes all in on artificial intelligence, rebranding its research division to Google AI

Just before the keynote, Google announced it is rebranding its Google Research division to Google AI. The move signals how Google has increasingly focused R&D on computer vision, natural language processing, and neural networks.

Google makes talking to the Assistant more natural with “continued conversation”

What Google announced: Google announced a “continued conversation” update to Google Assistant that makes talking to the Assistant feel more natural. Now, instead of having to say “Hey Google” or “OK Google” every time you want to say a command, you’ll only have to do so the first time. The company also is adding a new feature that allows you to ask multiple questions within the same request. All this will roll out in the coming weeks.

Why it’s important: When you’re having a typical conversation, odds are you are asking follow-up questions if you didn’t get the answer you wanted. But it can be jarring to have to say “Hey Google” every single time, and it breaks the whole flow and makes the process feel pretty unnatural. If Google wants to be a significant player when it comes to voice interfaces, the actual interaction has to feel like a conversation — not just a series of queries.

Google Photos gets an AI boost

What Google announced: Google Photos already makes it easy for you to correct photos with built-in editing tools and AI-powered features for automatically creating collages, movies and stylized photos. Now, Photos is getting more AI-powered fixes like B&W photo colorization, brightness correction and suggested rotations. A new version of the Google Photos app will suggest quick fixes and tweaks like rotations, brightness corrections or adding pops of color.

Why it’s important: Google is working to become a hub for all of your photos, and it’s able to woo potential users by offering powerful tools to edit, sort, and modify those photos. Each additional photo Google gets offers it more data and helps it get better and better at image recognition, which in the end not only improves the user experience, but also makes Google’s own tools for its services better. Google, at its heart, is a search company, and it needs a lot of data to get visual search right.

Google Assistant and YouTube are coming to Smart Displays

What Google announced: Smart Displays were the talk of Google’s CES push this year, but we haven’t heard much about Google’s Echo Show competitor since. At I/O, we got a little more insight into the company’s smart display efforts. Google’s first Smart Displays will launch in July, and of course will be powered by Google Assistant and YouTube. It’s clear that the company’s invested some resources into building a visual-first version of Assistant, justifying the addition of a screen to the experience.

Why it’s important: Users are increasingly getting accustomed to the idea of some smart device sitting in their living room that will answer their questions. But Google is looking to create a system where a user can ask questions and then have an option to have some kind of visual display for actions that just can’t be resolved with a voice interface. Google Assistant handles the voice part of that equation — and having YouTube is a good service that goes alongside that.

Google Assistant is coming to Google Maps

What Google announced: Google Assistant is coming to Google Maps, available on iOS and Android this summer. The addition is meant to provide better recommendations to users. Google has long worked to make Maps seem more personalized, but since Maps is now about far more than just directions, the company is introducing new features to give you better recommendations for local places.

The maps integration also combines the camera, computer vision technology, and Google Maps with Street View. With the camera/Maps combination, it really looks like you’ve jumped inside Street View. Google Lens can do things like identify buildings, or even dog breeds, just by pointing your camera at the object in question. It will also be able to identify text.

Why it’s important: Maps is one of Google’s biggest and most important products. There’s a lot of excitement around augmented reality — you can point to phenomena like Pokémon Go — and companies are just starting to scratch the surface of the best use cases for it. Figuring out directions seems like such a natural use case for a camera, and while it was a bit of a technical feat, it gives Google yet another perk for its Maps users to keep them inside the service and not switch over to alternatives. Again, with Google, everything comes back to the data, and it’s able to capture more data if users stick around in its apps.

Google announces a new generation for its TPU machine learning hardware

What Google announced: As the war for creating customized AI hardware heats up, Google said that it is rolling out its third generation of silicon, the Tensor Processing Unit 3.0. Google CEO Sundar Pichai said the new TPU is 8x more powerful than last year per pod, with up to 100 petaflops in performance. Google joins pretty much every other major company in looking to create custom silicon in order to handle its machine learning operations.

Why it’s important: There’s a race to create the best machine learning tools for developers. Whether that’s at the framework level with tools like TensorFlow or PyTorch or at the actual hardware level, the company that’s able to lock developers into its ecosystem will have an advantage over its competitors. It’s especially important as Google looks to build its cloud platform, GCP, into a massive business while going up against Amazon’s AWS and Microsoft Azure. Giving developers, who are already adopting TensorFlow en masse, a way to speed up their operations can help Google continue to woo them into Google’s ecosystem.

MOUNTAIN VIEW, CA – MAY 08: Google CEO Sundar Pichai delivers the keynote address at the Google I/O 2018 Conference at Shoreline Amphitheater on May 8, 2018 in Mountain View, California. Google’s two day developer conference runs through Wednesday May 9. (Photo by Justin Sullivan/Getty Images)

Google News gets an AI-powered redesign

What Google announced: Watch out, Facebook. Google is also planning to leverage AI in a revamped version of Google News. The AI-powered, redesigned news destination app will “allow users to keep up with the news they care about, understand the full story, and enjoy and support the publishers they trust.” It will leverage elements found in Google’s digital magazine app, Newsstand and YouTube, and introduce new features like “newscasts” and “full coverage” to help people get a summary or a more holistic view of a news story.

Why it’s important: Facebook’s main product is literally called “News Feed,” and it serves as a major source of information for a non-trivial portion of the planet. But Facebook is embroiled in a scandal over personal data of as many as 87 million users ending up in the hands of a political research firm, and there are a lot of questions over Facebook’s algorithms and whether they surface legitimate information. That’s a huge hole that Google could exploit by offering a better news product and, once again, locking users into its ecosystem.

Google unveils ML Kit, an SDK that makes it easy to add AI smarts to iOS and Android apps

What Google announced: Google unveiled ML Kit, a new software development kit for app developers on iOS and Android that allows them to integrate pre-built, Google-provided machine learning models into apps. The models support text recognition, face detection, barcode scanning, image labeling and landmark recognition.

Why it’s important: Machine learning tools have enabled a new wave of use cases, including ones built on top of image recognition or speech detection. But even though frameworks like TensorFlow have made it easier to build applications that tap those tools, it can still take a high level of expertise to get them off the ground and running. Developers often figure out the best use cases for new tools and devices, and development kits like ML Kit help lower the barrier to entry and give developers without a ton of expertise in machine learning a playground to start figuring out interesting use cases for those applications.

So when will you be able to actually play with all these new features? The Android P beta is available today, and you can find the upgrade here.

‘SmartLens’ app created by a high schooler is a step towards all-purpose visual search

Posted by | Apps, artificial intelligence, Computer Vision, machine learning, Mobile, TC | No Comments

A couple of years ago I was eagerly expectant of an app that would identify anything you pointed it at. Turns out the problem was much harder than anyone expected — but that didn’t stop high school senior Michael Royzen from trying. His app, SmartLens, attempts to solve the problem of seeing something and wanting to identify and learn more about it — with mixed success, to be sure, but it’s something I don’t mind having in my pocket.

Royzen reached out to me a while back and I was curious — as well as skeptical — about the idea that where the likes of Google and Apple have so far failed (or at least failed to release anything good), a high schooler working in his spare time would succeed. I met him at a coffee shop to see the app in action and was pleasantly surprised, but a little baffled.

The idea is simple, of course: You point your phone’s camera at something and the app attempts to identify it using an enormous but highly optimized classification agent trained on tens of millions of images. It connects to Wikipedia and Amazon to let you immediately learn more about what you’ve ID’ed, or buy it.

It recognizes more than 17,000 objects — things like different species of fruit and flower, landmarks, tools and so on. The app had little trouble telling an apple from a (weird-looking) mango, a banana from a plantain and even identified the pistachios I’d ordered as a snack. Later, in my own testing, I found it quite useful for identifying the plants springing up in my neighborhood: periwinkles, anemones, wood sorrel, it got them all, though not without the occasional hesitation.

The kicker is that this all happens offline — it’s not sending an image over the cell network or Wi-Fi to a server somewhere to be analyzed. It all happens on-device and within a second or two. Royzen scraped his own image database from various sources and trained up multiple convolutional neural networks using days of AWS EC2 compute time.

Beyond those 17,000-plus objects, it recognizes far more products by reading the text on the item and querying the Amazon database. It ID’ed books, a bottle of pills and other packaged goods almost instantly, providing links to buy them. Wikipedia links pop up if you’re online as well, though a considerable number of basic descriptions are kept on the device.

On that note, it must be said that SmartLens is a more than 500-megabyte download. Royzen’s model is huge, since it must keep all the recognition data and offline content right there on the phone. This is a much different approach to the problem than Amazon’s own product recognition engine on the Fire Phone (RIP) or Google Goggles (RIP) or the scan feature in Google Photos (which was pretty useless for things SmartLens reliably did in half a second).

“With the several past generations of smartphones containing desktop-class processors and the advent of native machine learning APIs that can harness them (and GPUs), the hardware exists for a blazing-fast visual search engine,” Royzen wrote in an email. But none of the large companies you would expect to create one has done so. Why?

The app size and toll on the processor are one thing, for sure, but edge and on-device processing is where all this stuff will go eventually; Royzen is just getting an early start. The likely truth is twofold: it’s hard to make money and the quality of the search isn’t high enough.

It must be said at this point that SmartLens, while smart, is far from infallible. Its suggestions for what an item might be are almost always hilariously wrong for a moment before arriving at, as it often does, the correct answer.

It identified one book I had as “White Whale,” and no, it wasn’t Moby Dick. An actual whale paperweight it decided was a trowel. Many items briefly flashed guesses of “Human being” or “Product design” before getting to a guess with higher confidence. One flowering bush it identified as four or five different plants — including, of course, Human Being. My monitor was a “computer display,” “liquid crystal display,” “computer monitor,” “computer,” “computer screen,” “display device” and more. Game controllers were all “control.” A spatula was a wooden spoon (close enough), with the inexplicable subheading “booby prize.” What?!

This level of performance (and weirdness in general, however entertaining) wouldn’t be tolerated in a standalone product released by Google or Apple. Google Lens was slow and bad, but it’s just an optional feature in a working, useful app. If Google put out a standalone visual search app that identified flowers as people, the company would never hear the end of it.

And the other side of it is the monetization aspect. Although it’s theoretically convenient to be able to snap a picture of a book your friend has and instantly order it, it isn’t so much more convenient than taking a picture and searching for it later, or just typing the first few words into Google or Amazon, which will do the rest for you.

Meanwhile for the user there is still confusion. What can it identify? What can’t it identify? What do I need it to identify? It’s meant to ID many things, from dog breeds to storefronts, but it likely won’t identify, for example, a cool Bluetooth speaker or mechanical watch your friend has, or the creator of a painting at a local gallery (some paintings are recognized, though). As I used it I felt like I was only ever going to use it for a handful of tasks in which it had proven itself, like identifying flowers, but would be hesitant to try it on many other things when I might just be frustrated by some unknown incapability or unreliability.

And yet the idea that in the very near future there will not be something just like SmartLens is ridiculous to me. It seems so clearly something we will all take for granted in a few years. And it’ll be on-device, no need to upload your image to a server somewhere to be analyzed on your behalf.

Royzen’s app has its issues, but it works very well in many circumstances and has obvious utility. The idea that you could point your phone at the restaurant you’re across the street from and see Yelp reviews two seconds later — no need to open up a map or type in an address or name — is an extremely natural expansion of existing search paradigms.

“Visual search is still a niche, but my goal is to give people the taste of a future where one app can deliver useful information about anything around them — today,” wrote Royzen. “Still, it’s inevitable that big companies will launch their competing offerings eventually. My strategy is to beat them to market as the first universal visual search app and amass as many users as possible so I can stay ahead (or be acquired).”

My biggest gripe of all, however, is not with the functionality of the app, but with how Royzen has decided to monetize it. Users can download it for free but upon opening it are immediately prompted to sign up for a $2/month subscription, before they can even see whether the app works or not. If I didn’t already know what the app did and didn’t do, I would delete it without a second thought upon seeing that dialog, and even knowing what I do, I’m not likely to pay in perpetuity for it.

A one-time fee to activate the app would be more than reasonable, and there’s always the option of referral codes for those Amazon purchases. But demanding rent from users who haven’t even tested the product is a non-starter. I’ve told Royzen my concerns and I hope he reconsiders.

It would also be nice to scan images you’ve already taken, or save images associated with searches. UI improvements like a confidence indicator or some kind of feedback to let you know it’s still working on identification would be nice as well — features that are at least theoretically on the way.

In the end I’m impressed with Royzen’s efforts — when I take a step back it’s amazing to me that it’s possible for a single person, let alone one in high school, to put together an app capable of completing such sophisticated computer vision tasks. It’s the kind of (over-) ambitious app-building one expects to come out of a big, playful company like the Google of a decade ago. This may be more of a curiosity than a tool right now, but so were the first text-based search engines.

SmartLens is in the App Store now — give it a shot.

Machine Learning Zone: OpenAI competition takes on Sonic the Hedgehog

Posted by | artificial intelligence, Gaming, machine learning, OpenAI, science, sega, Sonic the Hedgehog, TC | No Comments

Retro video games have been a useful platform for machine learning research for years, and the systems created have been creeping through the classics, mastering them as they go. Sonic the Hedgehog may be the next to fall: OpenAI has announced a competition to apply machine learning to the classic Sega game.

It’s not vastly different from what’s been attempted before, things like playing Super Mario Bros or Space Invaders, or even the likes of Doom. But the rules are a bit different here.

A very basic summary of how AIs learn to play something like Mario is this: an algorithm is set up with some basic capabilities like recognizing objects on screen and monitoring the in-game score. It’s then set free on the game itself and allowed access to the controls, with the sole goal of maximizing its score.

Over millions of tries the machine learns that in order to score, it needs to hit start first, then that it needs to move to the right, then that goombas kill it (and stop it from scoring more), coins give it points and so on. It does this all basically from recognizing the shapes on the screen or, in some cases, from accessing the game geometry and system memory directly — it doesn’t care about the Princess, and it may develop strange behaviors that result from its single-minded pursuit of incrementing its score integer.

This one, for example, learned that it can glitch through the walls to get ahead quickly:

Great job!
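The score-chasing loop described above can be sketched in a few lines. This is a minimal illustration, not OpenAI’s setup or any contestant’s method: the six-square “level,” the 10-point payoff, the function names and every parameter are invented for the example, and tabular Q-learning stands in for the far larger systems that work from screen pixels.

```python
import random

# Hypothetical toy "level": squares 0..5; reaching square 5 finishes it.
LEVEL_LENGTH = 6
ACTIONS = ("left", "right")


def step(position, action):
    """Advance the toy game one frame; return (new_position, score_gain, done)."""
    move = 1 if action == "right" else -1
    new_position = max(0, min(LEVEL_LENGTH - 1, position + move))
    done = new_position == LEVEL_LENGTH - 1
    # The agent is told nothing except how much the score changed.
    score_gain = 10.0 if done else 0.0
    return new_position, score_gain, done


def pick_best(q, position, rng):
    """Pick the highest-valued action, breaking ties at random."""
    best = max(q[(position, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(position, a)] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: many tries, each nudging the value estimates."""
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in range(LEVEL_LENGTH) for a in ACTIONS}
    for _ in range(episodes):
        position = 0
        for _ in range(100):  # cap on frames per try
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # occasionally try something random
            else:
                action = pick_best(q, position, rng)
            new_position, gain, done = step(position, action)
            best_next = 0.0 if done else max(q[(new_position, a)] for a in ACTIONS)
            # Move the estimate toward "score now plus discounted score later".
            q[(position, action)] += alpha * (gain + gamma * best_next - q[(position, action)])
            position = new_position
            if done:
                break
    return q


def greedy_policy(q):
    """The action the trained agent prefers on each non-final square."""
    return [max(ACTIONS, key=lambda a: q[(p, a)]) for p in range(LEVEL_LENGTH - 1)]


print(greedy_policy(train()))
```

Nothing in the code says that “right” is the way forward; the preference emerges only because moving right eventually raises the score, which is the same single-mindedness that produces wall-glitching in a real game.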

Another thing the OpenAI folks point out is that these systems often learn on the games and levels on which they are evaluated. It’s a sort of “teaching to the test” situation. So in the new competition, not only is the game more complicated than Mario (as anyone who’s played Sonic can tell you), but the systems created will be tested on levels to which they’ve had limited exposure.

They won’t be going in blind — the risk of an AI breaking from the first is too high. But while researchers will have all the time in the world to design a training and learning mechanism based on a selection of Sonic levels, the test will involve applying that training mechanism to a new set of levels, under a strict time limit (18 hours of game time).

This means you have to create an agent that understands not just one level of Sonic, but Sonic as a gestalt. If your AI knows all the shortcuts in Green Hill Zone, it may excel there, but when sent to the Chemical Plant Zone, it’ll choke (like me) when it encounters the scary underwater parts.

You don’t jump like normal! It’s a lot of pressure with the stuff coming up!

It also means your algorithm has to train efficiently, which may involve all kinds of techniques and shortcuts. Minimizing training time means minimizing lazy learning and paying attention to multiple sources of information at once.

There are also different control methods, gimmicks and physics in each game, so it may be that identifying those before making the run could be critical to success. Really, there are all kinds of things to consider. (It’s making me want to go back and play these great games.)

Contestants will be using OpenAI’s Gym Retro platform, which essentially wraps an emulator playing Sonic (and a set of other Sega games) in the tools developers need to extract data, map inputs and so on.

Winners don’t get any cash or anything, but first through third place will get trophies and will have the opportunity to co-author a report on the contest. OpenAI’s reports are interesting and widely read, so it sounds like a good opportunity if you have the time and inclination — although, of course, “it’s great exposure” is the classic payment avoidance strategy.

There are lots more games in the package of games OpenAI is using — I’d like to see an AI take on Gunstar Heroes, or Golden Axe III.

Rainforest Connection enlists machine learning to listen for loggers and jaguars in the Amazon

Posted by | artificial intelligence, Cloud, Gadgets, Google, GreenTech, logging, machine learning, Mobile, rainforest connection, science, tensorflow | No Comments

The vastness that makes the Amazon rainforest so diverse and fertile also makes it extremely difficult to protect. Rainforest Connection is a project started back in 2014 that used solar-powered second-hand phones as listening stations that could alert authorities to sounds of illegal logging. And applying machine learning has supercharged the network’s capabilities.

The original idea is still in play: modern smartphones are powerful and versatile tools, and work well as wireless sound detectors. But as founder Topher White explained in an interview, the approach is limited to what you can get the phones to detect.

Originally, he said, the phones just listened for certain harmonics indicating, for example, a chainsaw. But bringing machine learning into the mix wrings much more out of the audio stream.

“Now we’re talking about detecting species, gunshots, voices, things that are more subtle,” he said. “And these models can improve over time. We can go back into years of recordings to figure out what patterns we can pull out of this. We’re turning this into a big data problem.”

White said he realized early on that the phones couldn’t do that kind of calculation, though — even if their efficiency-focused CPUs could do it, the effort would probably drain the battery. So he began working with Google’s TensorFlow platform to perform the training and integration of new data in the cloud.
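The earlier approach of listening for certain harmonics can be sketched roughly as follows. This is an assumption-laden illustration, not Rainforest Connection’s code: the 100 Hz fundamental, the 5 percent threshold, the function names and the synthetic test signals are all made up for the example, and the Goertzel algorithm is simply one standard, lightweight way to measure energy at a few chosen frequencies.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the DFT bin nearest target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def tone_energy_share(samples, sample_rate, target_hz):
    """Approximate fraction of the clip's energy sitting at target_hz."""
    total = sum(x * x for x in samples)
    if total == 0.0:
        return 0.0
    return (2.0 * goertzel_power(samples, sample_rate, target_hz) / len(samples)) / total


def sounds_like_harmonic_source(samples, sample_rate, fundamental_hz,
                                harmonics=3, min_share=0.05):
    """True if the fundamental and its first few multiples all carry energy."""
    return all(
        tone_energy_share(samples, sample_rate, fundamental_hz * h) >= min_share
        for h in range(1, harmonics + 1)
    )


def synth(sample_rate, n, tones):
    """Build a test clip from (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2.0 * math.pi * hz * i / sample_rate) for hz, a in tones)
        for i in range(n)
    ]


SAMPLE_RATE = 8000  # hypothetical sampling rate
engine_like = synth(SAMPLE_RATE, 800, [(100, 1.0), (200, 0.6), (300, 0.4)])
single_tone = synth(SAMPLE_RATE, 800, [(100, 1.0)])

print(sounds_like_harmonic_source(engine_like, SAMPLE_RATE, 100))
print(sounds_like_harmonic_source(single_tone, SAMPLE_RATE, 100))
```

A check like this only answers the question it was written for: a stack of harmonics at a known pitch. It has nothing to say about species, gunshots or voices, which is the limitation that learned models trained on years of recordings are meant to overcome.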

Google also helped produce a nice little documentary about one situation where Guardians could help native populations deter loggers and poachers:

That’s in the Amazon, obviously, but Rainforest Connection has also set up stations in Cameroon and Sumatra, with others on the way.

Machine learning models are particularly good at finding patterns in noisy data that sound logical but defy easy identification through other means.

For instance, White said, “We should be able to detect animals that don’t make sounds. Jaguars might not always be vocalizing, but the animals around them are, birds and things.” The presence of a big cat then, might be easier to detect by listening for alarmed bird calls than for its near-silent movement through the forest.

The listening stations can be placed as far as 25 kilometers (about 15 miles) from the nearest cell tower. And because a device can detect chainsaws a kilometer away and some species half a kilometer away, it’s not like they need to be on every tree.

But, as you may know, the Amazon is rather a big forest. He wants more people to get involved, especially students. White partnered with Google to launch a pilot program where kids can build their own “Guardian,” as the augmented phone kits are called. When I talked with him it was moments before one such workshop in LA.

Topher White and students at one of the Guardian building workshops.

“We’ve already done three schools and I think a couple hundred students, plus three more in about half an hour,” he told me. “And all these devices will be deployed in the Amazon over the next three weeks. On Earth Day they’ll be able to see them, and download the app to stream the sounds. It’s to show these kids that what they do can have an immediate effect.”

“An important part is making it inclusive, proving these things can be built by anyone in the world, and showing how anyone can access the data and do something cool with it. You don’t need to be a data scientist to do it,” he continued.

Getting more people involved is the key to the project, and to that end Rainforest Connection is working on a few new tricks. One is an app you’ll be able to download this summer “where people can put their phone on their windowsill and get alerts when there’s a species in the back yard.”

The other is a more public API; currently only partners like companies and researchers can access it. But with a little help, all the streams from the many online Guardians will be available for anyone to listen to, monitor and analyze. But that’s all contingent on having money.

“If we want to keep this program going, we need to find some funding,” White said. “We’re looking at grants and at corporate sponsorship — it’s a great way to get kids involved too, in both technology and ecology.”

Donations help, but partnerships with hardware makers and local businesses are more valuable. Want to join up? You can get at Rainforest Connection here.

Google’s new YouTube Stories feature lets you swap out your background (no green screen required)

Posted by | artificial intelligence, augmented reality, Gadgets, Google, machine learning, neural networks, TC, YouTube, youtube stories | No Comments

Google researchers know how much people like to trick others into thinking they’re on the moon, or that it’s night instead of day, and other fun shenanigans only possible if you happen to be in a movie studio in front of a green screen. So they did what any good 2018 coder would do: build a neural network that lets you do it.
