

Training on the MNIST Dataset with Chainer Using a GPU

I realized I have a presentation coming up soon, so I frantically started building a model.
Since I was refreshing my memory of Chainer anyway, and I had just gotten hold of a GPU and set up my environment, I decided to try classifying MNIST.

I wrote it again while referring to various blogs and Qiita posts.
To switch from CPU mode to GPU mode, you need the following:

# GPU setup
cuda.get_device(0).use()
model.to_gpu()
# Do this after the model is set up (here, after instantiation).
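
To make that ordering concrete, here is a minimal sketch of the whole setup sequence (Chainer v1 style; MNISTChain and optimizers.Adam are the model and optimizer used in the full script below):

model = MNISTChain()          # 1. build the model (parameters start on the CPU)
cuda.get_device(0).use()      # 2. select GPU 0
model.to_gpu()                # 3. move the parameters onto the GPU
optimizer = optimizers.Adam()
optimizer.setup(model)        # 4. attach the optimizer to the GPU-resident model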

plus this alias:

xp = cuda.cupy

With this alias in place, swap out all the NumPy array operations for xp.
One caveat: some methods that exist in NumPy are missing from cupy, so you may have to write small workarounds yourself.
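
As a minimal sketch of that kind of workaround (assuming a Chainer v1-era cupy; np.median is just an illustrative example of a function that early cupy did not provide):

import numpy as np
from chainer import cuda

xp = cuda.cupy

x_gpu = xp.arange(10, dtype=xp.float32)  # lives on the GPU

# When cupy lacks an operation, round-trip through NumPy:
x_cpu = cuda.to_cpu(x_gpu)               # cupy.ndarray -> numpy.ndarray
m = np.median(x_cpu)                     # compute on the CPU instead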
The one point that tripped me up a bit this time:

 def forward(self, x_data, y_data, train=True):
        x, t = Variable(cuda.to_gpu(x_data)), Variable(cuda.to_gpu(y_data))
        h1 = F.dropout(F.relu(self.l1(x)), train=train)
        h2 = F.dropout(F.relu(self.l2(h1)), train=train)
        y = self.l3(h2)

In this forward method, the x_data and y_data arguments have to be converted to cupy with cuda.to_gpu.
Likewise, matplotlib does not support cupy, so the data needs to be converted back to NumPy; for that, use the inverse function, cuda.to_cpu.
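
As a small sketch of handing GPU results to matplotlib (the accuracy values here are made up for illustration):

import matplotlib.pyplot as plt
from chainer import cuda

xp = cuda.cupy

# Pretend these accuracies came out of forward passes on the GPU:
acc_gpu = xp.asarray([0.91, 0.95, 0.97], dtype=xp.float32)

# matplotlib cannot plot cupy arrays, so convert back first:
acc_cpu = cuda.to_cpu(acc_gpu)           # now a numpy.ndarray
plt.plot(range(len(acc_cpu)), acc_cpu)
plt.show()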
The full source code is below.

# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_mldata
from chainer import cuda, Variable, optimizers, Chain
import chainer.functions as F

plt.style.use("ggplot")

batchsize = 100
n_epoch = 20
n_units = 1000  # hidden layer width
pixel_size = 28
xp = cuda.cupy

# Define the model as a Chainer Chain
class MNISTChain(Chain):
    def __init__(self):
        super(MNISTChain, self).__init__(
            l1=F.Linear(784, n_units),
            l2=F.Linear(n_units, n_units),
            l3=F.Linear(n_units, 10)
        )

    def forward(self, x_data, y_data, train=True):
        x, t = Variable(cuda.to_gpu(x_data)), Variable(cuda.to_gpu(y_data))
        h1 = F.dropout(F.relu(self.l1(x)), train=train)
        h2 = F.dropout(F.relu(self.l2(h1)), train=train)
        y = self.l3(h2)

        # Use softmax cross-entropy as the loss function
        return F.softmax_cross_entropy(y, t), F.accuracy(y, t)


# Download the MNIST image data
print("fetch MNIST dataset")
mnist = fetch_mldata('MNIST original', data_home=".")
# mnist.data: 70,000 samples, each a 28x28 = 784-dimensional vector
mnist.data = mnist.data.astype(xp.float32)
mnist.data /= 255  # normalize pixel values to [0, 1]

# mnist.target: ground-truth labels
mnist.target = mnist.target.astype(xp.int32)

# Use the first N samples for training and the rest for testing
N = 60000
x_train, x_test = xp.split(mnist.data, [N])
y_train, y_test = xp.split(mnist.target, [N])
N_test = y_test.size

# Instantiate the model
model = MNISTChain()
# GPU setup
cuda.get_device(0).use()
model.to_gpu()
# Optimizer setup
optimizer = optimizers.Adam()
optimizer.setup(model)

# train and show results
train_loss = []
train_acc = []
test_loss = []
test_acc = []

# Learning loop
for epoch in range(1, n_epoch + 1):
    print("epoch", epoch)

    # training
    # Randomly permute the order of the N training samples
    perm = np.random.permutation(N)
    sum_accuracy = 0
    sum_loss = 0

    # Train on the data from 0 to N, one minibatch at a time
    for i in range(0, N, batchsize):
        x_batch = x_train[perm[i:i + batchsize]]
        y_batch = y_train[perm[i:i + batchsize]]

        # Reset gradients
        optimizer.zero_grads()
        # Forward pass: compute loss and accuracy
        loss, acc = model.forward(x_batch, y_batch)
        # Backpropagate to compute gradients
        loss.backward()
        optimizer.update()

        train_loss.append(cuda.to_cpu(loss.data))
        train_acc.append(cuda.to_cpu(acc.data))
        sum_loss += float(cuda.to_cpu(loss.data)) * batchsize
        sum_accuracy += float(cuda.to_cpu(acc.data)) * batchsize

    # Report mean loss and accuracy on the training data
    print("train mean loss = {0}, accuracy = {1}".format(sum_loss / N, sum_accuracy / N))

    # evaluation
    # Compute loss and accuracy on the test data to check generalization
    sum_accuracy = 0
    sum_loss = 0
    for i in range(0, N_test, batchsize):
        x_batch = x_test[i:i + batchsize]
        y_batch = y_test[i:i + batchsize]

        # Forward pass: compute loss and accuracy
        loss, acc = model.forward(x_batch, y_batch, train=False)

        test_loss.append(cuda.to_cpu(loss.data))
        test_acc.append(cuda.to_cpu(acc.data))
        sum_loss += float(cuda.to_cpu(loss.data)) * batchsize
        sum_accuracy += float(cuda.to_cpu(acc.data)) * batchsize

    # Report mean loss and accuracy on the test data
    print("test  mean loss = {0}, accuracy = {1}".format(sum_loss / N_test, sum_accuracy / N_test))


# Plot the accuracy curves
plt.figure(figsize=(8, 6))
plt.plot(range(len(train_acc)), train_acc)
plt.plot(range(len(test_acc)), test_acc)
plt.legend(["train_acc", "test_acc"], loc=4)
plt.title("Accuracy of MNIST recognition.")
plt.show()
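
Before running the whole script, it can be worth confirming that Chainer actually sees CUDA; a one-line check (this helper does exist in Chainer's cuda module):

from chainer import cuda
cuda.check_cuda_available()  # raises RuntimeError if CUDA/cupy is not usable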

Results
epoch 1
train mean loss = 0.279147822180142, accuracy = 0.9147833364456892
test mean loss = 0.1345167052175384, accuracy = 0.9583000040054321
epoch 2
train mean loss = 0.13670490186351042, accuracy = 0.9580500034491221
test mean loss = 0.09032138824637513, accuracy = 0.9720000040531158
epoch 3
train mean loss = 0.10736077687082192, accuracy = 0.9674000060558319
test mean loss = 0.07859976113279117, accuracy = 0.9760000038146973
epoch 4
train mean loss = 0.09530005491416281, accuracy = 0.9700833410024643
test mean loss = 0.07128305285063107, accuracy = 0.9774000066518783
epoch 5
train mean loss = 0.084215888553299, accuracy = 0.9739500084519386
test mean loss = 0.06781253896362614, accuracy = 0.9797000044584274
epoch 6
train mean loss = 0.07421663962226982, accuracy = 0.9765500106414159
test mean loss = 0.06678849312360398, accuracy = 0.9806000036001206
epoch 7
train mean loss = 0.07206320945755579, accuracy = 0.9780166767040889
test mean loss = 0.07613256934724631, accuracy = 0.9780000054836273
epoch 8
train mean loss = 0.06645605529270446, accuracy = 0.9794666781028112
test mean loss = 0.0694200206313326, accuracy = 0.9796000057458878
epoch 9
train mean loss = 0.06027518025541213, accuracy = 0.9811333427826564
test mean loss = 0.06582227945725208, accuracy = 0.981700006723404
epoch 10
train mean loss = 0.06272635982527087, accuracy = 0.9803333427508673
test mean loss = 0.06362627781827541, accuracy = 0.9824000054597855
epoch 11
train mean loss = 0.05525149108822613, accuracy = 0.9822166780630748
test mean loss = 0.059438277111039495, accuracy = 0.984400006532669
epoch 12
train mean loss = 0.05769200769641126, accuracy = 0.9823666775226593
test mean loss = 0.06064008094675955, accuracy = 0.9821000069379806
epoch 13
train mean loss = 0.05246836805231093, accuracy = 0.9837333444754283
test mean loss = 0.06008758193208678, accuracy = 0.9841000068187714
epoch 14
train mean loss = 0.05325017308399159, accuracy = 0.983550010919571
test mean loss = 0.06348420954424, accuracy = 0.9837000048160554
epoch 15
train mean loss = 0.04687451647075553, accuracy = 0.9857833432157834
test mean loss = 0.06079861531910865, accuracy = 0.9840000057220459
epoch 16
train mean loss = 0.05024066871997396, accuracy = 0.9843666763106982
test mean loss = 0.06952258783453544, accuracy = 0.9837000060081482
epoch 17
train mean loss = 0.04789641752979757, accuracy = 0.985533343354861
test mean loss = 0.0688758676241605, accuracy = 0.9830000072717666
epoch 18
train mean loss = 0.04523991203939658, accuracy = 0.9865000093976657
test mean loss = 0.06825290421969839, accuracy = 0.9847000086307526
epoch 19
train mean loss = 0.04624386687396812, accuracy = 0.9864333433906237
test mean loss = 0.06506712173536017, accuracy = 0.9841000074148178
epoch 20
train mean loss = 0.04339798055817179, accuracy = 0.9869333428144454
test mean loss = 0.07073728352328346, accuracy = 0.9837000060081482

The resulting plot:
[Figure: training and test accuracy per iteration (f:id:taichitary:20170216215346p:plain)]

The loss fell and the accuracy rose, so the training clearly went well.