Introduction
The Iris dataset is a classic dataset that is frequently used as an example in both statistical learning and machine learning. It contains 150 records in 3 classes, 50 per class, and each record has 4 features: sepal length, sepal width, petal length, and petal width. From these 4 features we can predict which of the species (iris-setosa, iris-versicolour, iris-virginica) a flower belongs to.
```python
import tensorflow as tf
```
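Only the `tensorflow` import survived in the snippet above, but the later cells also use pandas, NumPy, Matplotlib, Keras and scikit-learn's `train_test_split`. A reasonable guess at the companion imports:

```python
# Assumed companion imports for the rest of the notebook
from tensorflow import keras
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
```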
1. Load the dataset:
```python
datasets = pd.read_csv('./input/Iris.csv')
```
| | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|---|
| 0 | 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |
| ... | ... | ... | ... | ... | ... | ... |
| 145 | 146 | 6.7 | 3.0 | 5.2 | 2.3 | Iris-virginica |
| 146 | 147 | 6.3 | 2.5 | 5.0 | 1.9 | Iris-virginica |
| 147 | 148 | 6.5 | 3.0 | 5.2 | 2.0 | Iris-virginica |
| 148 | 149 | 6.2 | 3.4 | 5.4 | 2.3 | Iris-virginica |
| 149 | 150 | 5.9 | 3.0 | 5.1 | 1.8 | Iris-virginica |
150 rows × 6 columns
2. Prepare the data: X
```python
X = datasets[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
```
| | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
| ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 |
150 rows × 4 columns
3. Prepare the data: y
```python
# Step 1: one-hot encode the labels.
# The encoding line itself did not survive; joining pd.get_dummies of the
# 'Species' column is an assumption consistent with the table below.
datasets = datasets.join(pd.get_dummies(datasets['Species']))
del datasets['Species']
y = datasets.iloc[:, -3:]
```
| | Iris-setosa | Iris-versicolor | Iris-virginica |
|---|---|---|---|
| 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 |
| 2 | 1 | 0 | 0 |
| 3 | 1 | 0 | 0 |
| 4 | 1 | 0 | 0 |
| ... | ... | ... | ... |
| 145 | 0 | 0 | 1 |
| 146 | 0 | 0 | 1 |
| 147 | 0 | 0 | 1 |
| 148 | 0 | 0 | 1 |
| 149 | 0 | 0 | 1 |
150 rows × 3 columns
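For reference, an equivalent way to build the same one-hot targets is scikit-learn's `LabelEncoder` plus `keras.utils.to_categorical`; this is only an alternative sketch, not the original notebook's code:

```python
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

# Re-read the species column (it was deleted from `datasets` above)
species = pd.read_csv('./input/Iris.csv')['Species']
labels = LabelEncoder().fit_transform(species)    # 0 = setosa, 1 = versicolor, 2 = virginica
y_onehot = to_categorical(labels, num_classes=3)  # shape (150, 3), same as the table above
```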
4. Split into training and test sets
```python
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.2, random_state=42)
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)  # shapes shown below
```
(120, 4) (120, 3) (30, 4) (30, 3)
5. Build the model
```python
model = keras.Sequential()
```
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 3) 15
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
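Only the first line of the model-building cell survived. Judging from the summary (a single `Dense` layer with 3 outputs and 15 parameters, i.e. 4×3 weights plus 3 biases) and the `categorical_crossentropy` loss used below, the cell was presumably close to this sketch; the softmax activation is an assumption:

```python
from tensorflow.keras import layers

model = keras.Sequential()
# One Dense layer: 4 input features -> 3 class scores
# (4 * 3 weights + 3 biases = 15 parameters, matching the summary above)
model.add(layers.Dense(3, activation='softmax', input_shape=(4,)))
model.summary()
```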
6. Compile the model
```python
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
```
7. Train the model
```python
history = model.fit(train_X, train_y, epochs=500, validation_data=(test_X, test_y))
```
8. Evaluate the model
```python
loss, acc = model.evaluate(train_X, train_y)
```
4/4 [==============================] - 0s 1ms/step - loss: 0.4869 - acc: 0.9333
```python
loss, acc = model.evaluate(test_X, test_y)
```
1/1 [==============================] - 0s 25ms/step - loss: 0.4598 - acc: 0.9667
```python
plt.plot(history.epoch, history.history.get('acc'), label="acc")
```

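The original accuracy figure is not recoverable here; a slightly fuller version of the plotting cell, also showing validation accuracy, would look roughly like this:

```python
plt.plot(history.epoch, history.history.get('acc'), label='acc')
plt.plot(history.epoch, history.history.get('val_acc'), label='val_acc')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
```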
9. Make predictions with the model
```python
prediction = model.predict(test_X)
```
```python
# look at the first prediction
```
1
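The inspection cell above is truncated; a typical way to look at the first test sample's prediction is shown below. The `np.argmax` call is an assumption, and it would print a bare class index like the `1` shown above:

```python
# Each row of `prediction` holds 3 class probabilities
print(prediction[0])             # probability vector for the first test sample
print(np.argmax(prediction[0]))  # index of the most likely class (0, 1 or 2)
```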
10. Save the model
10.1 Save the model with h5py (HDF5)
```python
import h5py
```
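The rest of the saving cell is missing; with Keras this is typically a single `model.save` call to an HDF5 file (the file name below is an assumption):

```python
# Save architecture, weights and optimizer state into one HDF5 file
model.save('iris_model.h5')
```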
10.2 Use the saved model
```python
from keras.models import load_model
```
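Loading is the mirror image, again with an assumed file name:

```python
# Restore the trained model from the HDF5 file saved above
model = load_model('iris_model.h5')
```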
10.3 Predict with the reloaded model
```python
pred = model.predict(np.array([[5.5, 2.4, 3.7, 1.]]))
```
1
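To map the probability vector to a species name, the usual step is `np.argmax` (giving a class index such as the `1` shown above) followed by a lookup into the alphabetical column order from the one-hot table; a sketch:

```python
class_names = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
idx = np.argmax(pred[0])        # index of the most likely class
print(idx, class_names[idx])
```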
The model-saving code below produced error messages, and I'm not sure why.
11.1 Save the model with pickle
```python
import joblib
```
INFO:tensorflow:Assets written to: ram://806e146a-aef3-4ddb-8f95-0e484b2bda19/assets
['test.pkl']
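The dump cell is truncated to its import; judging from the `['test.pkl']` return value above, it was presumably something like the sketch below. The `Assets written to: ram://…` log appears because pickling a Keras model goes through TensorFlow's SavedModel serialization, and that round-trip can be fragile, which may be related to the errors mentioned above; `model.save` / `load_model` is generally the more robust path.

```python
# Presumed original cell: serialize the Keras model with joblib
joblib.dump(model, 'test.pkl')   # returns ['test.pkl']
```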
```python
m = joblib.load('test.pkl')
```
```python
import pickle
```
11.2 Save the model with pickle and compress it with gzip
```python
import pickle
```
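Only the import survived here as well; a minimal sketch of pickling the model through a gzip-compressed file (file name assumed):

```python
import gzip

# Write the pickled model through a gzip-compressed file object
with gzip.open('test.pkl.gz', 'wb') as f:
    pickle.dump(model, f)
```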
12. Load the model
Note: the input for model prediction must be a NumPy array, in 2-D matrix form.
12.1 Load the pickled model
```python
# read back the saved model
```
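The body of that cell did not survive; a sketch of loading the gzip-pickled model and predicting, consistent with the note above that the input must be a 2-D NumPy array (file name assumed, sample values taken from section 10.3):

```python
import gzip, pickle
import numpy as np

# Read the gzip-compressed pickle back into a model object
with gzip.open('test.pkl.gz', 'rb') as f:
    loaded_model = pickle.load(f)

# Prediction input must be a 2-D NumPy array of shape (n_samples, 4)
sample = np.array([[5.5, 2.4, 3.7, 1.0]])
print(loaded_model.predict(sample))
```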