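Python Basics Tutorial 1: Basic Concepts, Sequences, and Strings

Basic concepts
- Variables
- Comments
- Statements
- Functions
- Methods
- The print function
- Getting user input
- Numbers
- None
Sequences
Strings
- Defining strings
- String methods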

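The Python Basics Tutorial series introduces Python step by step, starting from the fundamentals. All articles in the series are based on Python 3.

This first article covers the essentials for getting started: basic concepts, sequences, and strings. It aims to give you a first impression of Python and of programming in general.

Basic concepts

Variables

A variable is a name that refers to a value. An assignment binds a value to a variable, and the variable can then be used in expressions.

A variable name may contain letters, digits, and underscores, but it cannot start with a digit.

                    pi = 3.14
                    pi * 2          # 6.28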

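Comments

Python uses the hash sign # to start a comment.

                    # This is a comment.

Statements

A statement is an instruction that performs an action. Each of the following is a statement.

                    >>> print(2*2)          # prints a value (a function call used as a statement)
                    >>> PI = 3.14           # assignment statement
                    >>> import math         # import statement
                    >>> if age > 18:        # conditional statement (header only)
                    >>> while age > 18:     # loop statement (header only)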

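Functions

A function is a packaged piece of code that performs a specific task. A function can accept arguments, and calling it returns a result to the caller.

                    >>> pow(2, 3)   # exponentiation
                    8
                    >>> abs(-2)     # absolute value
                    2

Methods

A method is a function that is bound to an object, such as a list, a dictionary, or a string. A method is generally called as object.method(arguments).

                    >>> lst = [1, 2]
                    >>> lst.append(3)
                    >>> lst
                    [1, 2, 3]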

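The print() function

print(value1, value2, ..., valueN, sep=' ', end='\n') prints its arguments to the console. It accepts several values in one call (separated by ' ' by default) and is often used for tracing and debugging code.

                    >>> print('Hello, World!!!')
                    Hello, World!!!
                    >>> print('Hello,','World!!!')
                    Hello, World!!!

In Python 2, print is a statement written without parentheses, as in print 'Hello, World!!!', which differs from Python 3.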

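Getting user input

The input function reads a line of input from the user.

                    >>> input('Please enter a number:')
                    Please enter a number:12
                    '12'                                # input always returns a string; convert it to another type if needed

Python 3 has no raw_input function; its input behaves like Python 2's raw_input and always returns a string.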
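
Because input always returns a string, numeric input has to be converted before it can be used in arithmetic. The following is a minimal sketch of one way to do that. The helper name parse_number is made up for this illustration and is not part of Python.

```python
def parse_number(text):
    """Return text converted to an int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


# In an interactive session the text would come from input(), for example:
#     number = parse_number(input('Please enter a number:'))
print(parse_number('12'))       # 12
print(parse_number('12') + 1)   # 13
print(parse_number('twelve'))   # None
```

Returning None for invalid input is only one possible design. A real program might instead ask the user again or report an error.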

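Numbers

Types

Numbers come in three types: integers (int), floating-point numbers (float), and complex numbers (complex).

Note: in Python 3, int has no size limit, and there is no separate long type.

Example:

                            int_var = 1             # integer
                            float_var = 1.2         # floating-point number
                            complex_var = 1 + 2j    # complex number

                            int_oct_var = 0o12      # octal, equals 10
                            int_hex_var = 0x1A      # hexadecimal, equals 26
                            int_bin_var = 0b1000    # binary, equals 8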

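Operations

Use +, -, *, and / for addition, subtraction, multiplication, and division.

Floating-point numbers are stored in binary, and most decimal fractions cannot be represented exactly as binary fractions, so the result of a floating-point operation may be an approximation.

For the basic operations, if any operand is a float, the result is also a float.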

                                
                                    3 + 3 = 6
                                    3.0 + 3 = 6.0

                                    6 - 3 = 3
                                    6.0 - 3 = 3.0

                                    10 - 3 * 3 = 1
                                    10 - 3.0 * 3 = 1.0

                                    (10 - 3) * 3 = 21

                                    0.1 + 0.2 = 0.30000000000000004     # approximate value
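
Since floating-point results can be approximate, comparing them with == can give surprising answers. The sketch below shows two common ways to deal with this using the standard library: math.isclose for tolerant comparison, and decimal.Decimal for exact decimal arithmetic. The variable names are chosen only for this illustration.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# Direct comparison fails because total is only approximately 0.3.
print(total == 0.3)                    # False

# math.isclose compares within a small relative tolerance.
print(math.isclose(total, 0.3))        # True

# Decimal values built from strings represent decimal fractions exactly.
exact_total = Decimal('0.1') + Decimal('0.2')
print(exact_total == Decimal('0.3'))   # True
```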

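Division / always produces a float, whether or not the division is exact.

                                    6 / 3 = 2.0
                                    3 / 2 = 1.5
                                    3.0 / 2 = 1.5

Floor division // returns the quotient rounded down to a whole number (toward negative infinity, so -7 // 2 is -4). If a float takes part, the result is a float of the form n.0.

                                    7 // 2 = 3
                                    7.0 // 2 = 3.0

Modulo % returns the remainder of the division. If a float takes part, the result is a float, which may have a fractional part and may be approximate.

                                    7 % 2 = 1
                                    7.0 % 2 = 1.0
                                    7.2 % 2 = 1.2   # the actual result is 1.2000000000000002

Exponentiation **

                                    2 ** 3 = 8
                                    -3 ** 2 = -9    # ** binds more tightly than unary minus, so this is -(3 ** 2)
                                    (-3) ** 2 = 9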

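Common functions
- abs(n): absolute value, abs(-10) = 10
- pow(x, y[, z]): exponentiation, x to the power y, pow(2,3) = 8 (2*2*2) / pow(2,3,5) = 3 (2*2*2 % 5)
- round(n): rounds to the nearest integer, round(1.2) = 1 / round(1.6) = 2 / round(-1.6) = -2; exact halves round to the nearest even integer, so round(2.5) = 2
- divmod(x, y): quotient and remainder, returns the tuple (x // y, x % y), divmod(8,3) = (2, 2)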
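
The short sketch below exercises these built-in functions, including the way round handles exact halves. The variable names are only for illustration.

```python
# round() sends exact halves to the nearest even integer.
halves = [round(0.5), round(1.5), round(2.5)]
print(halves)                   # [0, 2, 2]

# An optional second argument gives the number of decimal places.
print(round(3.14159, 2))        # 3.14

# divmod() returns the quotient and the remainder in one call.
minutes, seconds = divmod(125, 60)
print(minutes, seconds)         # 2 5

# The three-argument form of pow() computes (x ** y) % z.
print(pow(2, 10, 1000))         # 24
```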

None

None is a special value in Python that means "no value". It is not the same as 0 or the empty string ''; it represents the absence of a value.

                
                    >>> None == 0
                    False
                    >>> None == ''
                    False
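
In practice, None is usually tested with the is operator rather than ==. The sketch below shows how None, 0, and '' can be told apart. The function name describe is hypothetical and exists only for this example.

```python
def describe(value):
    """Tell apart a missing value (None) from an empty but present one."""
    if value is None:
        return 'no value'
    if value == 0 or value == '':
        return 'empty but present'
    return 'has a value'


print(describe(None))   # no value
print(describe(0))      # empty but present
print(describe(''))     # empty but present
print(describe(42))     # has a value
```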

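Sequences

A sequence is an ordered collection of elements. The elements can be numbers, strings, or other data structures. Each element has a numeric position, its index, and indexes start at 0.

Built-in sequences

Python 3 has several built-in sequence types: lists, tuples, range objects, and strings (str), along with the binary sequence types bytes, bytearray, and memoryview. The Unicode string, buffer, and xrange types belong to Python 2.

Most of the code in this section uses lists as examples.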

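Common sequence operations

Indexing: an element of a sequence can be accessed by its index.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[0]
                            11
                            >>> greeting = 'Hello'
                            >>> greeting[1]
                            'e'

An index can be negative; the last element has index -1.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[-1]
                            66
                            >>> greeting = 'Hello'
                            >>> greeting[-1]
                            'o'

Note that an index must stay within the range of the sequence; otherwise an exception is raised.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[10]
                            IndexError: list index out of range

Element assignment: in a mutable sequence such as a list, the element at a given index can be replaced with a new one.

                            >>> n_list = [11, 22, 11]
                            >>> n_list[2] = 33
                            >>> n_list
                            [11, 22, 33]

Note that you cannot assign to a position that does not exist; a non-negative index must be less than the length of the sequence.

Deleting elements: the del statement removes the element at a given position. The sequence then becomes one element shorter.

                            >>> n_list = [11, 22, 33]
                            >>> del n_list[2]           # delete the element at index 2 (33)
                            >>> n_list
                            [11, 22]

Likewise, you cannot delete an element at a position that does not exist; a non-negative index must be less than the length of the sequence.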

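Adding sequences

Adding two sequences of the same type returns a new sequence.

                            >>> [1, 2, 3] + [4, 5, 6]
                            [1, 2, 3, 4, 5, 6]          # a new list
                            >>> 'Hello, '+'world!!!'
                            'Hello, world!!!'           # a new string

Multiplying a sequence by a number x

This produces a new sequence that repeats the original sequence x times.

                            >>> [1] * 2
                            [1, 1]
                            >>> [1, 2] * 2
                            [1, 2, 1, 2]
                            >>> 'python' * 2
                            'pythonpython'

Membership: in

The in operator checks whether a value is present in a sequence and returns a Boolean, True or False.

                            >>> 1 in [1, 2, 3]
                            True
                            >>> 6 in [1, 2, 3]
                            False
                            >>> 'H' in 'Hello'
                            True
                            >>> 'h' in 'Hello'
                            False                               # case-sensitive
                            >>> [1, 2] in [1, 2, 3, [1, 2]]     # nested list
                            True

In general, the in operator checks whether an object is an element of the sequence, as in the examples above.

Strings are an exception: in also checks whether a run of several characters is a substring of the string, as shown here:

                            >>> 'He' in 'Hello'
                            True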
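Length, maximum, and minimum

The built-in functions len(), max(), and min() return the length, the largest element, and the smallest element of a sequence.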

                        
                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> len(n_list)
                            6
                            >>> max(n_list)
                            66
                            >>> min(n_list)
                            11

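Slicing

A slice works on a range of elements. Slices can be used to read, insert, replace, and delete elements of a sequence.

The slice syntax is: [ firstIncludedIndex : lastNotIncludedIndex ].

firstIncludedIndex: the index of the first element to extract; the element at this index is included in the slice.
lastNotIncludedIndex: the index at which the slice stops; the element at this index is not included in the slice.

Extracting elements

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[2:5]         # includes index 2 (33), excludes index 5 (66)
                            [33, 44, 55]

If the left index refers to a position that comes later in the sequence than the right index, the result is an empty sequence.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[-3:0]            # the third-from-last element comes after the first element
                            []

Both firstIncludedIndex and lastNotIncludedIndex can be omitted.

If firstIncludedIndex is omitted, the slice runs from the first element of the sequence up to lastNotIncludedIndex.
If lastNotIncludedIndex is omitted, the slice runs from firstIncludedIndex through the last element of the sequence.
If both are omitted, the slice runs from the first element through the last, which amounts to copying the whole sequence.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[:3]              # from the start up to index 3, excluding index 3
                            [11, 22, 33]
                            >>> n_list[2:]              # from index 2 to the end, including index 2
                            [33, 44, 55, 66]
                            >>> n_list[-3:]             # from index -3 to the end, including index -3
                            [44, 55, 66]
                            >>> n_list[:]               # from the start to the end
                            [11, 22, 33, 44, 55, 66]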

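Step

The default step of a slice is 1, which visits every element in the range. The step can be changed: a step of n takes the first element in the range and then every n-th element after it. The syntax looks like this:

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[1:5:1]               # same as n_list[1:5]
                            [22, 33, 44, 55]
                            >>> n_list[1:5:2]               # every 2nd element, starting at index 1
                            [22, 44]
                            >>> n_list[::2]                 # every 2nd element, starting at index 0
                            [11, 33, 55]

The step cannot be 0, but it can be negative, in which case the slice extracts elements from right to left. The inclusion rules for firstIncludedIndex and lastNotIncludedIndex still apply, and firstIncludedIndex must refer to a later position than lastNotIncludedIndex.

                            >>> n_list = [11, 22, 33, 44, 55, 66]
                            >>> n_list[5:1:-1]              # includes index 5 (66), excludes index 1 (22)
                            [66, 55, 44, 33]
                            >>> n_list[5:1:-2]
                            [66, 44]
                            >>> n_list[::-2]
                            [66, 44, 22]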
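
A few everyday uses of the step syntax are sketched below: reversing a sequence with a step of -1, picking alternate elements, and relying on the fact that slice bounds beyond the end of the sequence are clipped rather than raising an exception. The variable names are only for illustration.

```python
n_list = [11, 22, 33, 44, 55, 66]

# A step of -1 with both bounds omitted yields a reversed copy.
reversed_copy = n_list[::-1]
print(reversed_copy)        # [66, 55, 44, 33, 22, 11]

# The same syntax works for strings.
reversed_text = 'Hello'[::-1]
print(reversed_text)        # olleH

# Every second element, starting at index 1.
alternate = n_list[1::2]
print(alternate)            # [22, 44, 66]

# Unlike plain indexing, out-of-range slice bounds do not raise IndexError.
clipped = n_list[2:100]
print(clipped)              # [33, 44, 55, 66]

# Slicing never changes the original list.
print(n_list)               # [11, 22, 33, 44, 55, 66]
```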

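Slice assignment

Slice assignment can replace, insert, or delete elements of a sequence. Strings are immutable, so slice assignment does not work on them.

Although slice assignment can modify a sequence, it can be hard to read, so it is generally better to use the corresponding list methods instead.

Replacing

Slice assignment can replace several elements at once. The slice being replaced and the sequence that replaces it do not need to have the same length.

                                    >>> names = ['P', 'e', 'r', 'l']
                                    >>> names[1:] = ['y', 't', 'h', 'o', 'n']   # replace everything from index 1 to the end
                                    >>> names
                                    ['P', 'y', 't', 'h', 'o', 'n']

Inserting

Slice assignment can insert new elements without replacing any existing ones. This only requires firstIncludedIndex == lastNotIncludedIndex, which amounts to replacing an empty slice.

                                    >>> n_list = [11, 22, 55, 66]
                                    >>> n_list[2:2] = [33, 44]              # insert before index 2 (55)
                                    >>> n_list
                                    [11, 22, 33, 44, 55, 66]
                                    >>> n_list[len(n_list):] = [77, 88]     # append at the end; any index >= len(n_list) works here
                                    >>> n_list
                                    [11, 22, 33, 44, 55, 66, 77, 88]

Deleting

Assigning an empty sequence to a slice deletes the elements in that slice.

                                    >>> n_list = [11, 22, 33, 44, 55, 66]
                                    >>> n_list[2:4] = []                    # delete from index 2 (33) up to, but not including, index 4 (55): removes 33 and 44
                                    >>> n_list
                                    [11, 22, 55, 66]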
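
For comparison, the three slice assignments above can also be written with list methods and the del statement. This is a sketch of one possible equivalent, not the only one. The variable name trimmed is introduced only for this example.

```python
# Replace: drop the old tail, then extend with the new elements.
names = ['P', 'e', 'r', 'l']
del names[1:]
names.extend(['y', 't', 'h', 'o', 'n'])
print(names)        # ['P', 'y', 't', 'h', 'o', 'n']

# Insert: insert() places one element before the given index.
n_list = [11, 22, 55, 66]
n_list.insert(2, 33)
n_list.insert(3, 44)
n_list.extend([77, 88])     # extend() appends several elements at the end
print(n_list)       # [11, 22, 33, 44, 55, 66, 77, 88]

# Delete: del with a slice removes a range of elements.
trimmed = [11, 22, 33, 44, 55, 66]
del trimmed[2:4]
print(trimmed)      # [11, 22, 55, 66]
```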

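Strings (str)

A string is, as the name suggests, a string of characters, that is, a piece of text. It is one of the most commonly used data types in programming.

A string is also a sequence, so all the standard sequence operations (indexing, slicing, multiplication, membership tests, length, minimum, and maximum) work on strings as well.

Note, however, that strings are immutable, so element assignment and slice assignment are not allowed.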

                
                    >>> name = 'Python2'
                    >>> name[6] = '3'
                    TypeError: 'str' object does not support item assignment

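Defining strings

Strings are usually written with single quotes '' or double quotes "".

                            # single quotes
                            str1 = 'This is a string'
                            # double quotes
                            str2 = "This is a string"

Triple quotes (''' or """) can be used for long strings that span multiple lines.

                            # triple single quotes
                            str1 = '''This is a long string
                                    with newlines'''
                            # triple double quotes
                            str2 = """This is a long string
                                    with newlines"""

Long strings are often used as function documentation (docstrings), and are sometimes used like comments.

                            def is_ajax():
                                """
                                Check if the request is an ajax request.
                                :return: True if the X-Requested-With header is 'XMLHttpRequest'
                                """
                                # request is assumed to be a web framework's current request object
                                return request.headers.get('X-Requested-With') == 'XMLHttpRequest'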

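Escaping quotes

A single-quoted string can contain double quotes, and a double-quoted string can contain single quotes.

                            str1 = 'This is a "string"'     # double quotes inside a single-quoted string
                            str2 = "This is a 'string'"     # single quotes inside a double-quoted string

A single-quoted string cannot contain an unescaped single quote, and a double-quoted string cannot contain an unescaped double quote. The following examples are errors.

                            >>> err_str1 = 'Let's go!'
                            SyntaxError: invalid syntax
                            >>> str2 = "This is a "string""
                            SyntaxError: invalid syntax

To include a single quote in a single-quoted string, or a double quote in a double-quoted string, escape it with a backslash \.

                            str1 = 'Let\'s go!'
                            str2 = "This is a \"string\""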

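String concatenation

Strings can be concatenated with the plus sign +.

                            str1 = 'Hello, '+'world!!!'     # 'Hello, world!!!'

Note that a string cannot be concatenated with a non-string. To concatenate a non-string value, convert it to a string first.

str() and repr() are two ways to convert a value to a string. The backquote syntax (``) from Python 2 is no longer available in Python 3.

                            str1 = 'My age is '+str(18)      # 'My age is 18'
                            str2 = 'My age is '+repr(18)     # 'My age is 18'

In most cases str() and repr() give the same result. One major difference is that when the argument is a string, the result of repr() includes quotes (marking it as a string), while the result of str() does not.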

                        
                           str1 = 'My age is '+str('18')    # 'My age is 18'
                           str2 = 'My age is '+repr('18')   # "My age is '18'"

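String methods

The following are some frequently used string methods.

find(sub[, start[, end]])

Searches a string for a substring. It returns the lowest index at which the substring is found, or -1 if it is not found. A return value of 0 means the substring was found at index 0, so the result should be compared with -1 rather than tested for truth.

                            >>> str1 = 'This is a string'
                            >>> str1.find('is')         # lowest index at which the substring starts
                            2
                            >>> str1.find('This')       # found at index 0
                            0
                            >>> str1.find('IS')         # case-sensitive; returns -1 when not found
                            -1

find also accepts optional start and end arguments.

                            >>> str1 = '## This is a string ##'
                            >>> str1.find('##')             # no start or end given
                            0
                            >>> str1.find('##',1)           # with a start
                            20
                            >>> str1.find('##', 1, 16)      # with a start and an end
                            -1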

startswith(prefix[, start[, end]]) / endswith(suffix[, start[, end]])

Check whether a string starts or ends with the given characters.

                            >>> txt = 'Hello, welcome to my world'
                            >>> txt.startswith('Hello')         # does the string start with 'Hello'? Equivalent to txt[:5] == 'Hello'
                            True
                            >>> txt.startswith('world')
                            False

                            >>> txt.endswith('world')           # does the string end with 'world'? Equivalent to txt[-5:] == 'world'
                            True
                            >>> txt.endswith('World')           # case-sensitive
                            False

To check several alternatives, put them all in a tuple and pass the tuple to startswith() or endswith().

                            >>> 'http://zhangyiheng.com'.startswith(('http','https'))   # does the string start with 'http' or 'https'?
                            True
                            >>> 'http://zhangyiheng.com'.endswith(('com','site'))       # does the string end with 'com' or 'site'?
                            True

Note that multiple alternatives must be passed as a tuple. Any other collection, such as a list or a set, must first be converted with tuple().
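
The tuple requirement can be seen in the following sketch, which first passes a list (raising TypeError) and then converts it with tuple(). The URL and the variable names are made up for this example.

```python
schemes = ['http://', 'https://']           # alternatives stored in a list
url = 'https://example.com/index.html'      # example URL for illustration

# Passing the list directly is rejected.
list_rejected = False
try:
    url.startswith(schemes)
except TypeError:
    list_rejected = True
print(list_rejected)        # True

# Converting the list to a tuple makes the call valid.
is_web_url = url.startswith(tuple(schemes))
print(is_web_url)           # True
```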

join(iterable)

Joins the elements of a sequence and returns a string. Every element must be a string; otherwise an exception is raised.

                            >>> str_list = ['This', 'is', 'a', 'string']
                            >>> ' '.join(str_list)          # join the elements with ' ' and return the resulting string
                            'This is a string'
                            >>> int_list = [1, 2, 3]
                            >>> "+".join(int_list)          # joining non-string elements raises an exception
                            TypeError: sequence item 0: expected str instance, int found
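
One common way around this error is to convert each element to a string before joining. A minimal sketch follows. The variable names are only for illustration.

```python
int_list = [1, 2, 3]

# Convert every element with str() before joining.
expression = '+'.join(str(n) for n in int_list)
print(expression)       # 1+2+3

# map(str, ...) is an equivalent way to do the conversion.
mixed = ', '.join(map(str, [1.5, None, True]))
print(mixed)            # 1.5, None, True
```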

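split(sep=None)

Splits a string into a list and returns it. If no separator is given, the string is split on runs of whitespace (spaces, tabs, newlines, and so on).

Note that sep cannot be the empty string (''); passing it raises an exception.

                            >>> ip = '10.79.112.206'
                            >>> ip.split('.')               # split the string on the separator '.'
                            ['10', '79', '112', '206']
                            >>> ip.split('')                # the separator cannot be ''
                            ValueError: empty separator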
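
The sketch below shows how splitting without a separator differs from splitting on an explicit one. It also uses the optional second argument of split (maxsplit), which the text above does not cover. The sample strings are made up for this example.

```python
text = '  This   is\ta\nstring  '

# No separator: runs of whitespace are treated as one separator,
# and leading or trailing whitespace produces no empty strings.
words = text.split()
print(words)            # ['This', 'is', 'a', 'string']

# Explicit separator: consecutive separators produce empty strings.
parts = 'a  b'.split(' ')
print(parts)            # ['a', '', 'b']

# maxsplit limits the number of splits, counted from the left.
key, value = 'key=value=with=equals'.split('=', 1)
print(key, value)       # key value=with=equals
```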

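The split() method of str only handles simple cases: it takes a single fixed separator and cannot split on several different separators, or on separators followed by a varying amount of whitespace.

When you need more flexible splitting, use re.split() from the regular-expression module.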

                        
                            >>> import re
                            >>> line = 'asdf fjdk; afed, fjek,asdf, foo'
                            >>> fields = re.split(r'(?:,|;|\s)\s*', line)
                            >>> fields
                            ['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']

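strip([chars])/lstrip([chars])/rstrip([chars])

Return a copy of the string with whitespace (or the given characters) removed from both sides, the left side, or the right side. These methods only affect the ends of the string; whitespace in the middle is left alone.

                            >>> '     This is a string     \n\r\t'.strip()      # strips whitespace from both sides, including \n, \r, and \t as well as spaces
                            'This is a string'
                            >>> '     This is a string     \n\r\t'.lstrip()     # strips whitespace from the left side
                            'This is a string     \n\r\t'
                            >>> '     This is a string     \n\r\t'.rstrip()     # strips whitespace from the right side
                            '     This is a string'
                            >>> '------This is a string======'.strip('-=')      # strips the given characters
                            'This is a string'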

replace(old, new[, count])

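Returns a copy of the string in which every occurrence of old is replaced with new. If count is given, only the first count occurrences are replaced. This is similar to the familiar "find and replace" feature.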

                        
                            >>> 'This is a string'.replace('is','eez')
                            'Theez eez a string'
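
The optional count argument and the fact that replace returns a new string can be seen in this short sketch. The variable names are only for illustration.

```python
original = 'This is a string'

# With count=1 only the first occurrence is replaced.
first_only = original.replace('is', 'eez', 1)
print(first_only)       # Theez is a string

# Strings are immutable, so the original is unchanged.
print(original)         # This is a string
```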

lower()/upper()

Return a lowercase or uppercase copy of the string. These two methods are very useful when writing case-insensitive code.

                            >>> 'Python'.lower()    # lowercase
                            'python'
                            >>> 'Python'.upper()    # uppercase
                            'PYTHON'
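
As a sketch of the case-insensitive use mentioned above, both sides of a comparison can be lowercased first. The helper name same_ignoring_case is hypothetical and exists only for this example.

```python
def same_ignoring_case(a, b):
    """Compare two strings without regard to letter case."""
    return a.lower() == b.lower()


print(same_ignoring_case('Python', 'PYTHON'))   # True
print(same_ignoring_case('Python', 'Perl'))     # False

# The same idea makes a membership test case-insensitive.
print('h' in 'Hello')                           # False
print('h' in 'Hello'.lower())                   # True
```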

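center(width[, fillchar])/ljust(width[, fillchar])/rjust(width[, fillchar])

Return a string of the given width with the text centered, left-aligned, or right-aligned.

                            >>> 'Hello World'.center(20,'*')    # a string of length 20 with the text centered and the rest filled with *
                            '****Hello World*****'
                            >>> 'Hello World'.ljust(20,'*')     # a string of length 20 with the text left-aligned and the rest filled with *
                            'Hello World*********'
                            >>> 'Hello World'.rjust(20,'*')     # a string of length 20 with the text right-aligned and the rest filled with *
                            '*********Hello World'
                            >>> 'Hello World'.center(6,'*')     # if the width is less than or equal to the string's length, the original string is returned
                            'Hello World'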

title()

Converts the string to title case: the first letter of each word is uppercase and the remaining letters are lowercase.

                        
                            >>> str1 = 'this is a string'
                            >>> str1.title()
                            'This Is A String'

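index(sub[, start[, end]])

Returns the index at which the substring starts. If the substring is not found, an exception is raised.

                            >>> str1 = 'this is a string'
                            >>> str1.index('is')
                            2
                            >>> str1.index('is',2)  # start searching at index 2, inclusive
                            2
                            >>> str1.index('is',3)  # start searching at index 3
                            5
                            >>> str1.index('xx')    # not found, raises an exception
                            ValueError: substring not found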
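
Because index raises ValueError where find returns -1, code that uses index typically wraps the call in try/except. The sketch below shows both styles. The helper name locate is hypothetical.

```python
def locate(text, sub):
    """Return the index of sub in text, or None if it does not occur."""
    try:
        return text.index(sub)
    except ValueError:
        return None


str1 = 'this is a string'
print(locate(str1, 'is'))   # 2
print(locate(str1, 'xx'))   # None

# find reports a missing substring with -1 instead of an exception.
print(str1.find('xx'))      # -1
```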

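rindex(sub[, start[, end]])

Returns the highest index at which the substring starts, that is, the position of its last occurrence. If the substring is not found, an exception is raised.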

                        
                            >>> str1 = 'this is a string'
                            >>> str1.rindex('is')
                            5
                            >>> str1.rindex('xx')
                            ValueError: substring not found

count(sub[, start[, end]])

Returns the number of non-overlapping occurrences of the substring in the string, or 0 if the substring does not occur.

                        
                            >>> str1 = 'this is a string'
                            >>> str1.count('is')
                            2
                            >>> str1.count('xx')
                            0