I'm a total newbie; I just had some spare time and built a novel-writing AI, RNN architecture, that's all. Everyone is welcome to run it yourself (comrades with 2 GB of VRAM or no discrete GPU, brace yourselves for your computer to blow up). I humbly ask the site admin for guidance!

Mikoto 2023-09-09 23:09:22 2023-09-09 23:12:33 4
                                                  Disclaimer
This project is implemented with PyTorch. You can swap in your own training text, but remember: the training text absolutely must be in TXT format!!!
Also, this project is for learning and exchange only. Commercial use and entry into any competition, whether individual or organized by a group, are strictly forbidden; users must strictly comply with the Apache License, Version 2.0.
The author accepts no responsibility or liability for any consequences caused by this program.
Copyright belongs to the author, 胡哲涵, a primary-school student from Chongqing; GitHub: likehuiyuanai.
PS: I've been into A Certain Scientific Railgun lately, so the example code uses the Japanese light novel A Certain Magical Index (《魔法禁书目录》) as training data. I'm not releasing this dataset for now (you can compile your own; the more characters it has, the better the training tends to turn out).
The copyright of this blog post likewise belongs to 胡哲涵; reposting it or entering it into competitions is strictly forbidden!
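
Since you're encouraged to compile your own training text, here is a minimal sketch of one way to do it (the folder name my_novels is hypothetical, not part of the project): it just concatenates several .txt files into the single TXT file the notebook expects.

import glob

texts = []
for path in glob.glob('./my_novels/*.txt'):   # hypothetical source folder
    with open(path, 'r', encoding='utf-8') as f:
        texts.append(f.read())
with open('./mfjsml.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(texts))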

Let's officially begin. (No plans to build a code generator for now; I just don't like the idea. Code without bugs is like a person who has lost their soul!)

import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import time
from scipy.sparse import csr_matrix
from tensorboardX import SummaryWriter
%matplotlib inline

Import the libraries. Even kindergarteners have learned this, right?

with open('./mfjsml.txt', 'r', encoding='utf-8') as f:
    data = f.readlines()

Read the dataset. The name mfjsml can be changed, but put the dataset and the JupyterLab notebook in the same folder!

data = ''.join(data)   # merge the list of lines into one long string
print(data[:100])

Show part of the dataset.

chars = list(set(data))   # vocabulary: every distinct character in the text
data_size, vocab_size = len(data), len(chars)
print(f'data has {data_size} characters, {vocab_size} unique.')
char_to_ix = { ch:i for i,ch in enumerate(chars) }   # character -> index
ix_to_char = { i:ch for i,ch in enumerate(chars) }   # index -> character
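
A quick sanity check (my addition, not in the original): encoding a snippet with char_to_ix and decoding it back with ix_to_char should reproduce the text exactly.

snippet = data[:10]
encoded = [char_to_ix[ch] for ch in snippet]
decoded = ''.join(ix_to_char[i] for i in encoded)
assert decoded == snippet   # the round trip must be lossless
print(encoded)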

Next we one-hot encode the training data, and then we can build the LSTM model.

char_id = np.array([char_to_ix[c] for c in data])   # index of every character in the text
# Build the one-hot matrix directly in CSR form. (np.int has been removed
# from recent NumPy, so we use an explicit int8 dtype instead.)
X_train = csr_matrix((np.ones(len(data), dtype=np.int8),
                      (np.arange(len(data)), char_id)),
                     shape=(len(data), len(chars)))
y_train = np.roll(char_id, -1)                      # target = the next character
print(X_train.shape, y_train.shape)
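
Another small check worth running here (again my addition, not in the original): every row of X_train should hold a single 1 in the column of that row's character, and y_train should simply be the text shifted left by one.

assert X_train[0, char_to_ix[data[0]]] == 1    # one-hot row matches its character
assert ix_to_char[int(y_train[0])] == data[1]  # target is the next character
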
def get_batch(X_train, y_train, seq_length):
    '''Yield (X, y) training pairs of at most seq_length characters each.'''
    X = X_train
    y = torch.from_numpy(y_train).long()
    for i in range(0, len(y), seq_length):
        id_stop = min(i + seq_length, len(y))
        # densify only the slice we need, as float32 for the LSTM
        yield [torch.from_numpy(X[i:id_stop].toarray().astype(np.float32)),
               y[i:id_stop]]
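
To see what get_batch actually yields, here is a peek at the first batch (my addition): X should be a (seq_length, vocab_size) float tensor of one-hot rows, and y the matching (seq_length,) vector of target indices.

X_demo, y_demo = next(get_batch(X_train, y_train, 25))
print(X_demo.shape, y_demo.shape)   # torch.Size([25, vocab_size]) and torch.Size([25])
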
def sample_chars(rnn, X_seed, h_prev, length=20):
    '''Generate text of the given length from a trained model.'''
    X_next = X_seed
    results = []
    with torch.no_grad():
        for i in range(length):
            y_score, h_prev = rnn(X_next.view(1, 1, -1), h_prev)
            # turn the scores into a probability distribution and sample from it
            y_prob = nn.Softmax(0)(y_score.view(-1)).numpy()
            y_pred = np.random.choice(chars, 1, p=y_prob).item()
            results.append(y_pred)
            # one-hot encode the sampled character as the next input
            X_next = torch.zeros_like(X_seed)
            X_next[char_to_ix[y_pred]] = 1
    return ''.join(results)
class nn_LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, X, hidden):
        # hidden is the (h, c) pair; scores are read off the new hidden state h
        _, hidden = self.lstm(X, hidden)
        output = self.out(hidden[0])
        return output, hidden

    def initHidden(self):
        # (h0, c0), each of shape (num_layers, batch, hidden_size)
        return (torch.zeros(1, 1, self.hidden_size),
                torch.zeros(1, 1, self.hidden_size))
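
Before training, a quick shape check on a throwaway instance doesn't hurt (my addition): one one-hot character goes in, one score per vocabulary entry comes out.

demo_rnn = nn_LSTM(vocab_size, 256, vocab_size)
x0 = torch.zeros(1, 1, vocab_size)
x0[0, 0, char_to_ix[data[0]]] = 1    # one-hot encode the first character
scores, h = demo_rnn(x0, demo_rnn.initHidden())
print(scores.shape)                  # torch.Size([1, 1, vocab_size])
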
hidden_size = 256
seq_length = 25
rnn = nn_LSTM(vocab_size, hidden_size, vocab_size)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.005)
def train(X_batch, y_batch):
    h_prev = rnn.initHidden()
    optimizer.zero_grad()
    batch_loss = torch.tensor(0, dtype=torch.float)

    # feed the batch one character at a time, carrying the hidden state along
    for i in range(len(X_batch)):
        y_score, h_prev = rnn(X_batch[i].view(1, 1, -1), h_prev)
        loss = loss_fn(y_score.view(1, -1), y_batch[i].view(1))
        batch_loss += loss
    batch_loss.backward()
    optimizer.step()

    return y_score, batch_loss / len(X_batch)
writer = SummaryWriter(f'logs/lstm1_{time.strftime("%Y%m%d-%H%M%S")}')
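
Before committing to 20 epochs, it's worth a dry run of a single train() step on the first batch (my addition, not in the original); the returned loss should sit near ln(vocab_size), which is what uniform random guessing scores.

X_demo, y_demo = next(get_batch(X_train, y_train, seq_length))
_, demo_loss = train(X_demo, y_demo)
print(f'first-batch loss: {demo_loss.item():.4f}, ln(vocab_size) = {np.log(vocab_size):.4f}')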

Ready? All CUDA/Tensor cores: start up, start up, and this one too... start the training! (To be fair, the code as written runs on the CPU, so the GPU cores get the day off.)

all_losses = []
print_every = 100
for epoch in range(20):    
    for batch in get_batch(X_train, y_train, seq_length):
        X_batch, y_batch = batch
        _, batch_loss = train(X_batch, y_batch)
        all_losses.append(batch_loss.item())
        if len(all_losses)%print_every==1:
            print(f'----\nRunning Avg Loss:{np.mean(all_losses[-print_every:])} at iter: {len(all_losses)}\n----')
            # log to tensorboard every X iterations. Can be removed if Tensorboard is not installed.
            writer.add_scalar('loss', np.mean(all_losses[-print_every:]), len(all_losses))
            # generate text every X iterations
            print(sample_chars(rnn, X_batch[0], rnn.initHidden(), 200))
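
By the way, matplotlib was imported at the top but never used; plotting the loss curve is the natural job for it (my addition):

plt.figure(figsize=(8, 4))
plt.plot(all_losses)
plt.xlabel('iteration (batch)')
plt.ylabel('average batch loss')
plt.title('character-LSTM training loss')
plt.show()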

Finally, time to fire up the generator.

# Seed with the first character of the last batch; the final batch can be
# shorter than seq_length, so a large fixed index could crash here.
print(sample_chars(rnn, X_batch[0], rnn.initHidden(), 200))
torch.save(rnn.state_dict(), 'mfjsml.pth')   # save the trained weights
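
To reuse the saved weights later (say, in a fresh notebook), rebuild a network with the same sizes and load the state dict back in; this is the standard PyTorch pattern:

rnn2 = nn_LSTM(vocab_size, hidden_size, vocab_size)
rnn2.load_state_dict(torch.load('mfjsml.pth'))
rnn2.eval()   # switch to inference mode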

Save the model. A model you train yourself is yours to do whatever you like with; the author claims no copyright over it and takes no responsibility for it. Also, I recommend everyone go watch A Certain Scientific Railgun, it's really good! If you want to contact me, get my email from my profile page.


5 replies in total

Even226 (goofball)

I'm only 10, so I'm the preschooler here

Even226 (goofball)

newbie

Even226 (goofball)

You're already 12 and you still say you are one

Even226 (goofball)

6

Mikoto

How does it look? Personally I think I'm at kindergarten level in the AI field, but I'm working hard to pick up new knowledge and new techniques, grinding through foreign-language papers and higher mathematics every day as a 12-year-old sixth grader. I don't think any of the knowledge is truly hard; as long as it's something you love, even building rockets isn't difficult. By the way, if you hit errors running the code, reply here and I'll try to answer every one. An early happy 1024 Day to everyone, and may your studies go well! Keep at it, comrades!