[#E10] Deep Learning Basics: What Is the XOR Problem?

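Deep learning is built on neural networks (NN) and can be seen as a way of making machine learning more complete.

The figure below plots the value of y for each combination of x1 and x2.

As the figure shows, for the OR and AND graphs we can define a straight line that separates the data into its two classes. For the XOR graph, no such line exists.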
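
This claim can be checked with a small experiment. The sketch below is not from the original post: it assumes a single unit that outputs 1 when w1*x1 + w2*x2 + b > 0 and searches a small integer grid (an arbitrary choice) for weights that reproduce each truth table. The helper name find_line is hypothetical.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TABLES = {
    "OR":  [0, 1, 1, 1],
    "AND": [0, 0, 0, 1],
    "XOR": [0, 1, 1, 0],
}

def find_line(labels, grid=range(-3, 4)):
    """Return the first (w1, w2, b) on the grid that separates the labels, else None."""
    for w1, w2, b in product(grid, repeat=3):
        outputs = [1 if w1 * x1 + w2 * x2 + b > 0 else 0 for x1, x2 in INPUTS]
        if outputs == labels:
            return (w1, w2, b)
    return None

for name, labels in TABLES.items():
    print(name, find_line(labels))
```

OR and AND each yield a line, while XOR yields None. The grid search is only an illustration, but the result holds for any real weights: XOR would need b <= 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b <= 0, and adding the middle two gives w1 + w2 + b > -b >= 0, which contradicts the last condition.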

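The way to solve this XOR problem is deep learning, which trains its layers with the BackPropagation algorithm.

BackPropagation

See the link for the detailed lecture.

Deep learning basically stacks several layers and trains them as follows: it computes the difference (the cost) between the actual Y and the predicted Y, goes back to the start to readjust the variables, and repeats.

In other words, training is the same thing as minimizing the cost.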
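
To make that loop concrete, here is a minimal sketch in plain Python of one way to write the backward pass by hand for a 2-2-1 sigmoid network on the XOR data. It is an illustration under assumptions, not the post's method: the starting weights are arbitrary, and the names forward, cost, backward and step are hypothetical. A finite-difference comparison is included as a sanity check on the hand-derived gradient.

```python
import copy
import math

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 1, 1, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Return (hidden activations, prediction) for one input row."""
    W1, b1, W2, b2 = params
    h = [sigmoid(x[0] * W1[0][j] + x[1] * W1[1][j] + b1[j]) for j in range(2)]
    p = sigmoid(h[0] * W2[0] + h[1] * W2[1] + b2[0])
    return h, p

def cost(params):
    """Mean logistic (cross-entropy) cost over the four XOR rows."""
    total = 0.0
    for x, y in zip(X, Y):
        _, p = forward(params, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(X)

def backward(params):
    """Gradient of the cost for every parameter, via the chain rule."""
    W1, b1, W2, b2 = params
    gW1, gb1 = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
    gW2, gb2 = [0.0, 0.0], [0.0]
    for x, y in zip(X, Y):
        h, p = forward(params, x)
        d_out = (p - y) / len(X)          # d cost / d (output pre-activation)
        gb2[0] += d_out
        for j in range(2):
            gW2[j] += d_out * h[j]
            # Propagate the error back through W2 and the hidden sigmoid.
            d_hid = d_out * W2[j] * h[j] * (1 - h[j])
            gb1[j] += d_hid
            for i in range(2):
                gW1[i][j] += d_hid * x[i]
    return [gW1, gb1, gW2, gb2]

def step(params, grads, lr=0.1):
    """One gradient-descent update; returns new parameters."""
    W1, b1, W2, b2 = copy.deepcopy(params)
    gW1, gb1, gW2, gb2 = grads
    for j in range(2):
        for i in range(2):
            W1[i][j] -= lr * gW1[i][j]
        b1[j] -= lr * gb1[j]
        W2[j] -= lr * gW2[j]
    b2[0] -= lr * gb2[0]
    return [W1, b1, W2, b2]

# Arbitrary starting values (an assumption, chosen only for illustration).
init = [[[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [0.7, -0.6], [0.05]]
grads = backward(init)

# Sanity check: analytic gradient of W1[1][0] versus a central finite difference.
eps = 1e-5
plus, minus = copy.deepcopy(init), copy.deepcopy(init)
plus[0][1][0] += eps
minus[0][1][0] -= eps
numeric = (cost(plus) - cost(minus)) / (2 * eps)
print("analytic:", grads[0][1][0], "numeric:", numeric)

# "Go back and readjust the variables", repeated a few times.
params = init
for _ in range(3):
    params = step(params, backward(params))
```

The two printed gradients should agree to several decimal places. Repeating step many more times is the training loop in miniature.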

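We do not have to implement this algorithm ourselves; TensorFlow applies it automatically under the hood and produces the updated values.

All that is left for us is to stack the layers, which can be done as follows.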

Before (single layer)

W = tf.Variable(tf.random_normal([2, 2]), name='weight')
b = tf.Variable(tf.random_normal([2]), name='bias')
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

Neural network (deep learning)

# Hidden layer: 2 inputs -> 2 outputs
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)

# Output layer: 2 inputs (the outputs of layer1) -> 1 output
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')

hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)

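The code itself is very simple.

W1 and b1 form layer1. We then treat layer1 as the new X, multiply it by W2, and apply the sigmoid function again to build the new hypothesis.

(Why sigmoid: y_data contains only 0 and 1, so this is logistic classification.)

One point to watch: because W1 produces 2 outputs, W2 must accept 2 inputs.

The number of outputs of W2, on the other hand, is not forced to be 1 by this rule; it is 1 here only because y_data has a single column.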
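
The layering described above can be traced by hand without TensorFlow. The following is a minimal sketch in plain Python; the helper dense is hypothetical, and the weights are hand-picked for illustration rather than learned. It shows the output of layer1 (2 columns) feeding a W2 with 2 rows.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(rows, W, b):
    """Compute sigmoid(rows @ W + b) on plain nested lists."""
    n_in, n_out = len(W), len(W[0])
    assert len(b) == n_out, "bias length must match the columns of W"
    out = []
    for row in rows:
        assert len(row) == n_in, "input width must match the rows of W"
        out.append([sigmoid(sum(row[i] * W[i][j] for i in range(n_in)) + b[j])
                    for j in range(n_out)])
    return out

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]

# Hand-picked weights (an assumption for illustration, not learned values).
W1, b1 = [[5, -7], [5, -7]], [-8, 3]   # shape [2, 2]
W2, b2 = [[-11], [-11]], [6]           # shape [2, 1]

layer1 = dense(x_data, W1, b1)         # 4 rows x 2 columns
hypothesis = dense(layer1, W2, b2)     # 4 rows x 1 column
predicted = [[1 if p > 0.5 else 0 for p in row] for row in hypothesis]
print(predicted)  # [[0], [1], [1], [0]]
```

With these weights the two-layer network reproduces the XOR table, which a single layer cannot do. Training is what finds such weights automatically.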

Now let's solve the XOR problem.

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]

y_data = [[0], [1], [1], [0]]

As before, y_data contains only 0 and 1, so we use the cost function of logistic classification.
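
For reference, that cost can be computed by hand. This is a small sketch in plain Python, assuming the mean cross-entropy form -mean(y*log(h) + (1-y)*log(1-h)); the function name logistic_cost and the sample predictions are made up for illustration.

```python
import math

def logistic_cost(y_true, y_pred):
    """Mean cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, h in zip(y_true, y_pred):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(y_true)

y = [0, 1, 1, 0]
unsure = [0.5, 0.5, 0.5, 0.5]        # a model that cannot decide
confident = [0.01, 0.99, 0.99, 0.01]  # a model that is nearly right everywhere

print(logistic_cost(y, unsure))     # equals ln 2, about 0.693
print(logistic_cost(y, confident))  # equals -ln 0.99, about 0.010
```

The cost shrinks toward 0 as the predictions approach the labels, which is why minimizing it drives the network toward the right answers.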

SOURCE CODE

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
import numpy as np
 
x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]
 
x_data = np.array(x_data, dtype=np.float32)
y_data = np.array(y_data, dtype=np.float32)
 
X = tf.placeholder(tf.float32, [None, 2])
Y = tf.placeholder(tf.float32, [None, 1])
 
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
 
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
 
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
 
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))
 
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
 
# Accuracy computation
# True if hypothesis > 0.5 else False
 
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
 
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
 
    for step in range(10001):
        sess.run(train, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            print(step, sess.run(cost, feed_dict={
                  X: x_data, Y: y_data}), sess.run([W1, W2]))
 
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a)

Result

Hypothesis:  
[[0.01115024]
 [0.9874386 ]
 [0.98751473]
 [0.01938552]] 
Correct:  
[[0.]
 [1.]
 [1.]
 [0.]] 
Accuracy: 1.0

As the output above shows, the trained model predicts all four cases correctly.
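
The step from the Hypothesis values to the reported accuracy can be reproduced in a few lines of plain Python. This sketch copies the four Hypothesis values printed above and applies the same 0.5 threshold as the source code; it is an illustration of the arithmetic, not a replacement for the TensorFlow ops.

```python
# Hypothesis values copied from the result printed above.
hypothesis = [[0.01115024], [0.9874386], [0.98751473], [0.01938552]]
y_data = [[0], [1], [1], [0]]

# Same rule as the source code: 1.0 if hypothesis > 0.5 else 0.0
predicted = [[1.0 if h > 0.5 else 0.0 for h in row] for row in hypothesis]

# Accuracy is the fraction of predictions equal to the labels.
matches = [float(p == y)
           for p_row, y_row in zip(predicted, y_data)
           for p, y in zip(p_row, y_row)]
accuracy = sum(matches) / len(matches)

print(predicted, accuracy)  # [[0.0], [1.0], [1.0], [0.0]] 1.0
```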

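In short, this is called deep learning because a neural network stacks its layers deep and then trains them.

Layers can be stacked several at a time, as shown below. The important point is that the number of outputs of each layer must always equal the number of inputs of the next W.

This rule likewise places no restriction on the size of the final output; that size should instead match the labels (one column for the y_data above).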

[Model 1] : DEEP

W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1')
b1 = tf.Variable(tf.random_normal([10]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
 
W2 = tf.Variable(tf.random_normal([10, 10]), name='weight2')
b2 = tf.Variable(tf.random_normal([10]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)
 
W3 = tf.Variable(tf.random_normal([10, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3)
 
W4 = tf.Variable(tf.random_normal([10, 1]), name='weight4')
b4 = tf.Variable(tf.random_normal([1]), name='bias4')
hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4)

[Model 2] : WIDE

W1 = tf.Variable(tf.random_normal([2, 248]), name='weight1')
b1 = tf.Variable(tf.random_normal([248]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
 
W2 = tf.Variable(tf.random_normal([248, 48]), name='weight2')
b2 = tf.Variable(tf.random_normal([48]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)
 
W3 = tf.Variable(tf.random_normal([48, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3)
 
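# The final layer has 15 outputs here to show that the shape-matching rule
# allows any size; to train against a Y of shape [None, 1], use [10, 1] and [1].
W4 = tf.Variable(tf.random_normal([10, 15]), name='weight4')
b4 = tf.Variable(tf.random_normal([15]), name='bias4')
hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4)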
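
The shape rule for both models can be verified mechanically. The following is a small sketch in plain Python, with a hypothetical helper output_size that walks a list of weight shapes and reports where a chain breaks.

```python
def output_size(input_size, weight_shapes):
    """Walk the layers in order; raise ValueError where the shapes do not line up."""
    size = input_size
    for n, (rows, cols) in enumerate(weight_shapes, start=1):
        if rows != size:
            raise ValueError(f"W{n} expects {rows} inputs but receives {size}")
        size = cols
    return size

deep = [(2, 10), (10, 10), (10, 10), (10, 1)]     # Model 1
wide = [(2, 248), (248, 48), (48, 10), (10, 15)]  # Model 2

print(output_size(2, deep), output_size(2, wide))  # 1 15
```

Both chains are valid. Note that Model 2 ends with 15 outputs, so it would need labels with 15 columns; with the single-column XOR y_data, the last layer would be [10, 1] as in Model 1.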