About a year ago, Google announced the launch of Vertex AI, a managed AI platform designed to help enterprises accelerate the deployment of AI models. To mark the service’s anniversary and the kickoff of Google’s Applied ML Summit, Google this morning announced new features for Vertex, including an AI Training Reduction Server and “example-based” explanations.
“We launched Vertex AI a year ago with the goal of enabling a new generation of AI that empowers data scientists and engineers to do fulfilling and creative work,” Henry Tappen, group product manager at Google Cloud, told TechCrunch via email. “The new Vertex AI capabilities we are launching today will continue to accelerate the deployment of machine learning models in organizations and democratize AI so that more people can deploy models in production, continuously monitor them, and drive business impact with AI.”
As Google has long pitched it, the advantage of Vertex is that it brings together Google Cloud’s AI services under a unified user interface and API. Customers such as Ford, Seagate, Wayfair, Cash App, Cruise and Lowe’s use the service to build, train and deploy machine learning models in a single environment, Google says, taking models from experimentation to production.
Vertex competes with managed AI platforms from other cloud providers, like Amazon Web Services and Microsoft’s Azure. Technically, it falls under the category of platforms known as MLOps, a set of best practices for businesses to run AI. Deloitte predicts that the MLOps market will reach $4 billion in 2025, growing nearly 12x since 2019.
Gartner predicts that the emergence of managed services like Vertex will drive cloud market growth of 18.4% in 2021, with cloud expected to account for 14.2% of total global IT spending. “As enterprises increase their investments in mobility, collaboration, and other remote working technologies and infrastructure, the growth of the public cloud [will] be sustained through 2024,” Gartner wrote in a November 2020 study.
Among Vertex’s new features is the AI Training Reduction Server, a technology that Google says optimizes the bandwidth and latency of multi-node distributed training on Nvidia GPUs. In machine learning, “distributed training” refers to spreading the work of training a model across multiple machines, GPUs, CPUs or custom chips, reducing the time and resources needed to complete training.
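The core idea behind data-parallel distributed training, and the gradient-averaging step a reduction server accelerates, can be shown with a toy sketch. This is a conceptual illustration only, not how Vertex’s Reduction Server or Nvidia GPUs work internally:

```python
# Toy data-parallel training: each "worker" computes gradients on its shard of
# the batch, then gradients are averaged (an all-reduce) before the weight
# update. The averaging step is the communication a reduction server optimizes.

def gradient(w, shard):
    """Mean gradient of squared error for the linear model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(grads):
    """Average gradients across workers (the bandwidth-heavy step)."""
    return sum(grads) / len(grads)

def train_step(w, batch, num_workers, lr=0.01):
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    grads = [gradient(w, s) for s in shards]  # computed in parallel in practice
    return w - lr * all_reduce_mean(grads)

# Fit y = 3x from synthetic data, splitting each batch across 4 workers.
data = [(x, 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = train_step(w, data, num_workers=4)
print(round(w, 2))  # converges toward 3.0
```

In a real multi-node job the shards live on different machines and the all-reduce happens over the network, which is why its bandwidth and latency dominate training time for large models.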
“This dramatically reduces the training time required for large language workloads, like BERT, and further enables cost parity between different approaches,” said Andrew Moore, vice president and general manager of cloud AI at Google, in an article published today on the Google Cloud blog. “In many critical business scenarios, a shortened training cycle allows data scientists to train a model with higher predictive performance within the confines of a deployment window.”
Also in preview, Vertex now offers Tabular Workflows, which aim to bring greater customization to the model-building process. As Moore explained, Tabular Workflows let users choose which parts of the workflow they want Google’s “AutoML” technology to handle and which parts they want to design themselves. AutoML, or automated machine learning (which is not unique to Google Cloud or Vertex), covers any technology that automates aspects of AI development, from ingesting a raw data set through producing a ready-to-deploy machine learning model. AutoML can save time, but it can’t always beat a human touch, especially where precision is required.
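The mix-and-match approach Moore describes, automating some pipeline stages while hand-designing others, can be sketched with a toy workflow. The names here are hypothetical, not the Vertex AI Tabular Workflows API:

```python
# Hypothetical sketch of a customizable tabular workflow: each stage is either
# handled automatically or supplied by the user. Illustrative names only.

def auto_select_features(rows):
    """'AutoML' stand-in: keep only columns whose values vary (trivial heuristic)."""
    cols = range(len(rows[0]))
    keep = [c for c in cols if len({r[c] for r in rows}) > 1]
    return [[r[c] for c in keep] for r in rows]

def run_workflow(rows, feature_step="auto", custom_step=None):
    """Use the automated stage unless the user plugs in their own."""
    if feature_step == "auto":
        return auto_select_features(rows)
    return custom_step(rows)

data = [[1.0, 7.0, 0.0], [2.0, 7.0, 1.0], [3.0, 7.0, 0.0]]  # middle column is constant

print(run_workflow(data))                      # automated: drops the constant column
print(run_workflow(data, feature_step="custom",
                   custom_step=lambda rows: [r[:1] for r in rows]))  # hand-designed
```

The design point is the pluggable boundary: an expert can override the automated feature selection with domain knowledge while still letting automation handle the rest.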
“Tabular Workflows elements can also be integrated into your existing Vertex AI pipelines,” Moore said. “We’ve added new supported algorithms, including advanced research models like TabNet, new algorithms for feature selection, model distillation and… more.”
On the pipelines front, Vertex also gains a pre-release integration with Serverless Spark, the serverless version of the Apache-maintained open source analytics engine for data processing. Vertex users can now launch a serverless Spark session to develop code interactively.
Elsewhere, through a new partnership with Neo4j, customers can analyze data features in Neo4j’s platform and then deploy models using Vertex. And, thanks to a collaboration between Google and Labelbox, it’s now easier to access Labelbox’s data labeling services for images, text, audio and video from the Vertex dashboard. Labels are necessary for most AI models to learn to make predictions; models train to identify relationships between labels, also called annotations, and sample data (for example, the label “frog” and a photo of a frog).
For cases where data is mislabeled, Moore points to example-based explanations as a solution. Available in preview, the new Vertex capability uses “example-based” explanations to help diagnose and address data issues. Of course, no explainable AI technique can catch every error, and computational linguist Vagrant Gautam cautions against placing too much confidence in the tools and techniques used to explain AI.
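Example-based explanation techniques generally work by retrieving the training examples nearest to a given input in an embedding space; if the retrieved labels disagree, nearby training data may be mislabeled. A rough sketch of the idea, with toy vectors standing in for learned embeddings (not the Vertex implementation):

```python
# Sketch of an example-based explanation: for a queried input, return the k
# nearest training examples by embedding distance. Conflicting labels among
# close neighbors hint at a labeling problem worth auditing.

TRAIN = [
    ((1.0, 1.0), "frog"),
    ((1.1, 0.9), "frog"),
    ((1.05, 1.1), "cat"),   # suspicious: labeled "cat" amid "frog" neighbors
    ((5.0, 5.0), "cat"),
]

def explain(query, k=3):
    """Return the k training examples closest to the query embedding."""
    def dist(item):
        emb, _label = item
        return sum((a - b) ** 2 for a, b in zip(emb, query))
    return sorted(TRAIN, key=dist)[:k]

neighbors = explain((1.0, 1.05))
print([label for _, label in neighbors])  # disagreement flags a possible mislabel
```

A disagreeing neighbor is only a signal, not proof: the flagged example still needs a human to decide whether the label or the model is wrong, which is precisely the kind of limitation Gautam warns can get lost.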
“Google has documentation on limitations and a more detailed white paper on explainable AI, but none of that is mentioned anywhere in [today’s Vertex AI announcement],” they told TechCrunch via email. “The announcement emphasizes that ‘proficiency in skills shouldn’t be the criteria for participation’ and that the new features can ‘evolve AI for non-software experts.’ My worry is that non-experts have more confidence in AI and AI explainability than they should, and now various Google customers can build and deploy models faster without stopping to ask whether this is a problem that requires a machine learning solution in the first place, and can call their models explainable (and therefore trustworthy and good) without knowing the full extent of the limitations of that explainability for their particular use cases.”
Still, Moore suggests that example-based explanations can be a useful tool when used in tandem with other model auditing practices.
“Data scientists shouldn’t need to be infrastructure engineers or operations engineers to maintain accurate, explainable, scalable, disaster-resistant, and secure models in an ever-changing environment,” added Moore. “Our customers demand tools to easily manage and maintain machine learning models.”