CAPTCHAs: The Beginning

Posted by bfpiaoran on August 12, 2017

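I originally wanted to call this "The Past and Present of CAPTCHAs", but I only started looking at CAPTCHA recognition today,
so "the beginning" it is.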

pip install pytesseract
pip install Pillow

Then download and install Tesseract, Google's OCR engine, which does the actual recognition: http://code.google.com/p/tesseract-ocr

Now for the key step. I was stuck on this for an hour, and there are damn few answers about it in Chinese.

In the C:\Python34\Lib\site-packages\pytesseract directory there is a file named pytesseract.py.

Change tesseract_cmd = 'tesseract'
to tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'

Windows is such a pain.
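Editing a file inside site-packages gets overwritten whenever pytesseract is reinstalled, so here is a rough sketch of an alternative: set pytesseract.pytesseract.tesseract_cmd from your own script at runtime. The helper names find_tesseract and configure_pytesseract and the CANDIDATES list are made up for illustration, and the Windows path is simply the one from this post; adjust it to wherever Tesseract is installed on your machine.

```python
import os
import shutil

# Hypothetical candidate locations: the bare command name (found via PATH)
# and the install path used in this post. Adjust for your own machine.
CANDIDATES = [
    "tesseract",
    "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe",
]


def find_tesseract(candidates=CANDIDATES):
    """Return the first candidate found on PATH or existing as a file, else None."""
    for candidate in candidates:
        found = shutil.which(candidate)
        if found:
            return found
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract(candidates=CANDIDATES):
    """Point pytesseract at the located executable without editing pytesseract.py."""
    import pytesseract  # imported here so find_tesseract works without it installed

    cmd = find_tesseract(candidates)
    if cmd is None:
        raise FileNotFoundError("tesseract executable not found")
    # Override the module-level setting for this process only.
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling configure_pytesseract() once at the top of a script, before any image_to_string call, should have the same effect as the manual edit above, assuming one of the candidate paths exists.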

After that, the rest is easy.

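import sys
from io import BytesIO
from urllib.request import urlopen

import pytesseract
from PIL import Image

# The CAPTCHA image URL is passed as the first command-line argument.
url = sys.argv[1]

# Download the image bytes and wrap them in a seekable buffer for Pillow.
with urlopen(url) as response:
    img = Image.open(BytesIO(response.read()))

# Run Tesseract OCR on the image and print the recognized text.
text = pytesseract.image_to_string(img)
print(text)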

That's just a rough first pass.

The recognition rate is actually pretty high.