Introduction

Google Cloud provides several Deep Learning VM images that come pre-configured with key ML frameworks and tools and can be used out of the box. We will write a bash script for setting up an instance with

  • 2x CPUs with 13GB RAM
  • 1x Nvidia Tesla K80 GPU
  • 50GB disk size
  • TensorFlow GPU

Note that the resulting instance will have many more packages pre-installed.

I will assume the Cloud SDK is already installed and set up on your system.
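If you want to verify that first, a quick sanity check can be scripted. This is a small sketch; check_sdk is a hypothetical helper name, while the gcloud calls themselves are standard SDK commands:

```shell
#!/usr/bin/env bash
# Sanity check: is the Cloud SDK installed and is a project configured?
# check_sdk is a hypothetical helper, not part of the SDK itself.
check_sdk() {
    if ! command -v gcloud >/dev/null 2>&1; then
        echo "gcloud not found - install the Cloud SDK first"
        return 1
    fi
    # Prints the currently active project (empty or '(unset)' means
    # you still need to run 'gcloud init' or 'gcloud config set project').
    gcloud config get-value project
}
check_sdk || true
```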

Bash script

#!/usr/bin/env bash
export IMAGE_FAMILY='tf-latest-cu92'
export ZONE='us-central1-c'
export INSTANCE_NAME='tf-n1-highmem-2-k80-count-1'
export INSTANCE_TYPE='n1-highmem-2'
gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator='type=nvidia-tesla-k80,count=1' \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=50GB \
        --metadata='install-nvidia-driver=True'

INSTANCE_NAME will be the name of your VM. It will also be displayed on the Google Cloud Platform. I chose a name that reminds me of the chosen specifications. You can choose whatever name you like, as long as there is no other VM/instance with the same name.

--zone: You can pick whichever zone suits you best from the following list. However, note that not every zone provides GPUs, and some zones provide only a subset of all available GPU types. For more info on which GPUs are available in which zones, have a look here. Also note that prices may vary from zone to zone. I picked us-central1-c simply because it provides all GPU types and is among the cheapest zones.
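If in doubt, you can also ask the SDK itself which GPU types a zone offers. A small sketch, assuming gcloud is installed and authenticated (list_gpu_types is a made-up helper; without gcloud on the PATH it only prints the command it would run):

```shell
#!/usr/bin/env bash
# Ask the SDK which GPU (accelerator) types a zone offers.
# list_gpu_types is a made-up helper; without gcloud on the PATH it
# only prints the command it would run.
list_gpu_types() {
    local zone="${1:-us-central1-c}"
    local cmd=(gcloud compute accelerator-types list --filter="zone:${zone}")
    if command -v gcloud >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        echo "would run: ${cmd[*]}"
    fi
}
list_gpu_types us-central1-c
```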

--image-family: tf-latest-cu92 installs the latest TensorFlow GPU version. You can also specify a particular version: tf-VERSION-cu92 (e.g. tf-1-8-cu92). Other possible images are listed here.
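You can also browse the available images yourself by listing what the deeplearning-platform-release project publishes. A sketch (list_dl_images is a hypothetical helper; without gcloud it prints the command instead):

```shell
#!/usr/bin/env bash
# List the Deep Learning images published in the
# deeplearning-platform-release project (this is where image families
# such as tf-latest-cu92 live). list_dl_images is a hypothetical
# helper; without gcloud it prints the command instead of running it.
list_dl_images() {
    local cmd=(gcloud compute images list \
               --project=deeplearning-platform-release --no-standard-images)
    if command -v gcloud >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        echo "would run: ${cmd[*]}"
    fi
}
list_dl_images
```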

--image-project: this must be deeplearning-platform-release

--maintenance-policy: must be TERMINATE

--accelerator: here we specify the GPU type to use. The format is 'type=TYPE,count=COUNT'. More info regarding available types here.

--machine-type: here we specify the CPU and memory. There are standard machine types and high-memory machine types. All possible types can be found here.
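The machine types available in a zone can be listed directly as well. A sketch (list_machine_types is a hypothetical helper; the filter narrows the output to high-memory types like the one used above):

```shell
#!/usr/bin/env bash
# List machine types available in a zone, narrowed to high-memory ones.
# list_machine_types is a hypothetical helper; when gcloud is not
# installed it only prints the command it would run.
list_machine_types() {
    local zone="${1:-us-central1-c}"
    local cmd=(gcloud compute machine-types list \
               --zones="${zone}" --filter="name~n1-highmem")
    if command -v gcloud >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        echo "would run: ${cmd[*]}"
    fi
}
list_machine_types us-central1-c
```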

--metadata: here we specify via install-nvidia-driver=True that the NVIDIA driver should be installed. With this flag it may take up to 5 minutes until the VM is fully provisioned.
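Once the VM is up, you can check that the driver installation actually finished by running nvidia-smi on the instance over SSH. A sketch using the instance and zone chosen above (check_driver is a hypothetical helper; it prints the command instead when gcloud is not installed locally):

```shell
#!/usr/bin/env bash
# Verify the NVIDIA driver on the freshly provisioned VM by running
# nvidia-smi over SSH. check_driver is a hypothetical helper; it prints
# the command instead when gcloud is not installed locally.
check_driver() {
    local cmd=(gcloud compute ssh tf-n1-highmem-2-k80-count-1 \
               --zone=us-central1-c --command='nvidia-smi')
    if command -v gcloud >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        echo "would run: ${cmd[*]}"
    fi
}
check_driver
```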

Executing our new command

We give the script the name create-gcloud-tf-instance.sh, place it under ~/bin (you might need to create this directory, since it does not exist by default), and add the following lines to our .bash_profile:

# Add bin directory to path
export PATH=~/bin:"$PATH"

After restarting the terminal (or running source ~/.bash_profile) we need to make the script executable. This can be done from within the terminal via:

$ chmod 700 ~/bin/create-gcloud-tf-instance.sh

Since we just added the bin folder to our path, we can now execute our new command just like any other built-in command (tab completion should also work here):

$ create-gcloud-tf-instance.sh
WARNING: You have selected a disk size of under [200GB]. This may result in poor I/O performance. For more information, see: https://developers.google.com/compute/docs/disks#performance.
Created [https://www.googleapis.com/compute/v1/projects/deeprl-1/zones/us-central1-c/instances/blog].
NAME                         ZONE           MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP    STATUS
tf-n1-highmem-2-k80-count-1  us-central1-c  n1-highmem-2               10.128.0.3   35.239.61.170  RUNNING


Updates are available for some Cloud SDK components.  To install them,
please run:
  $ gcloud components update

Opening your cloud console in the browser https://console.cloud.google.com/, you will find your newly created instance.

The instance can be stopped from within the terminal via the gcloud SDK command

$ gcloud compute instances stop tf-n1-highmem-2-k80-count-1 --zone=us-central1-c 
Stopping instance(s) tf-n1-highmem-2-k80-count-1...⠏

Analogously, replacing stop with start will boot your instance.
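The start/stop pair can be wrapped in a small convenience function so you don't have to retype the instance name and zone. A sketch (instance_ctl is a made-up helper; it prints the command it would run when gcloud is not installed locally):

```shell
#!/usr/bin/env bash
# Convenience wrapper around start/stop for this instance.
# instance_ctl is a made-up helper; it prints the command it would run
# when gcloud is not installed locally.
instance_ctl() {
    case "$1" in
        start|stop) ;;
        *) echo "usage: instance_ctl start|stop" >&2; return 1 ;;
    esac
    local cmd=(gcloud compute instances "$1" \
               tf-n1-highmem-2-k80-count-1 --zone=us-central1-c)
    if command -v gcloud >/dev/null 2>&1; then
        "${cmd[@]}"
    else
        echo "would run: ${cmd[*]}"
    fi
}
instance_ctl stop
```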

Port-forwarding JupyterLab

In order to use JupyterLab in the browser of your local PC, we will create an SSH tunnel that forwards port 8080 of your gcloud VM to port 8080 of your local PC:

$ gcloud compute ssh tf-n1-highmem-2-k80-count-1 --zone=us-central1-c -- -L 8080:localhost:8080
Enter passphrase for key '/Users/apoehlmann/.ssh/google_compute_engine': 
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
======================================
Welcome to the Google Deep Learning VM
======================================

Based on: Debian GNU/Linux 9.5 (stretch) (GNU/Linux 4.9.0-8-amd64 x86_64)

Resources:
 * Google Deep Learning Platform StackOverflow: https://stackoverflow.com/questions/tagged/google-dl-platform
 * Google Cloud Documentation: https://cloud.google.com/deep-learning-vm
 * Google Group: https://groups.google.com/forum/#!forum/google-dl-platform

TensorFlow comes pre-installed with this image. To install TensorFlow binaries in a virtualenv (or conda env),
please use the binaries that are pre-built for this image. You can find the binaries at
/opt/deeplearning/binaries/tensorflow/
Note that public TensorFlow binaries may not work with this image.

Linux tf-n1-highmem-2-k80-count-1 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
apoehlmann@tf-n1-highmem-2-k80-count-1:~$

Note: the '--' argument must be specified between gcloud-specific args on the left and SSH_ARGS on the right. For more info, check the docs.

In your browser, you can now open JupyterLab via http://localhost:8080. It will automatically redirect to your VM's JupyterLab.
