How Amazon SageMaker Autopilot works

How it works

Using a single API call, or a few clicks in Amazon SageMaker Studio, SageMaker Autopilot first inspects your data set, and runs a number of candidates to figure out the optimal combination of data preprocessing steps, machine learning algorithms and hyperparameters. Then, it uses this combination to train an Inference Pipeline, which you can easily deploy either on a real-time endpoint or for batch processing. As usual with Amazon SageMaker, all of this takes place on fully-managed infrastructure.

Last but not least, SageMaker Autopilot also generates Python code showing you exactly how data was pre-processed. This will help you understand what SageMaker Autopilot did in the backend, and you can also reuse that code for further manual tuning, if you’re so inclined.

As of today, SageMaker Autopilot supports:

  • Input data in tabular format, with automatic data cleaning and preprocessing,
  • Automatic algorithm selection for linear regression, binary classification, and multi-class classification,
  • Automatic hyperparameter optimization,
  • Distributed training,
  • Automatic instance and cluster size selection.

Also see these links for a walkthrough for what we will be doing today: