[Linux] Common Commands: Text Processing, Redirection, Pipes, Streams, and Archiving and Compressing Files

2022-5-13

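Linux distributions are used everywhere today, from embedded devices to supercomputers, and they are firmly established on servers. For a developer, learning Linux is essential, and being able to carry out simple operations on a Linux system is a basic skill, so I will not dwell on why Linux matters. While learning, I put together a series on commonly used Linux commands. The material was collected from around the web, and I ran and verified the commands myself.

There are far too many Linux commands to learn them all, let alone memorize them, so this series covers only a selection that is common, practical, and necessary for back-end developers. I do not think these commands need to be memorized deliberately; they become second nature through repeated use. The series is mainly meant as a reference for my own later review, and I hope it helps you too!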

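The previous article covered commands for managing groups, users, and file permissions. For details, see: [Linux] Common Commands: Groups, Users, and File Permission Management, Finding Files, and Linux Software Repositories - 编程那点事儿 (imyjs.cn)

This article covers more advanced text-processing commands, redirection, pipes, streams, and archiving and compressing files.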

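1. Text Processing

grep

Searches globally for a regular expression and prints the matches to the screen. In short, it looks for a keyword in a file and shows the lines that contain it.

Basic syntax

grep text file # text is the text to search for; file is the file to search in

# Example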
[zhangsan@localhost ~]$ grep path /etc/profile
pathmunge () {
    pathmunge /usr/sbin
    pathmunge /usr/local/sbin
    pathmunge /usr/local/sbin after
    pathmunge /usr/sbin after
unset -f pathmunge
[zhangsan@localhost ~]$

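Common options

-i ignores case: grep -i path /etc/profile
-n shows line numbers: grep -n path /etc/profile
-v shows only the lines that do not contain the search text: grep -v path /etc/profile
-r searches recursively: grep -r hello /etc . Some distributions also provide an rgrep command that is equivalent to grep -r

Advanced usage

grep can be used with regular expressions. The -E option enables extended regular expressions; quote the pattern so that the shell does not interpret its special characters.

grep -E 'path' /etc/profile --> matches lines that contain path
grep -E '^path' /etc/profile --> matches lines that begin with path
grep -E '[Pp]ath' /etc/profile --> matches lines that contain path or Path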
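
The options above can be tried safely on a throwaway file. This is only a sketch: sample.txt and its contents are made up for illustration and stand in for /etc/profile.

```shell
# Scratch directory so nothing real is touched; removed when the shell exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/sample.txt"

# Hypothetical sample content, not the real /etc/profile.
printf '%s\n' 'PATH=/usr/bin' 'path is lowercase' 'export EDITOR=vi' 'a path here' > "$sample"

# -c counts the matching lines; the search is case-sensitive by default.
count=$(grep -c 'path' "$sample")

# Anchored character class: lines that begin with "path" or "Path".
starts=$(grep -E '^[Pp]ath' "$sample")

# -i ignores case, -n prefixes each match with its line number.
numbered=$(grep -in 'path' "$sample")

# -v inverts the match: only the lines without "path" in any case.
inverted=$(grep -iv 'path' "$sample")

echo "count=$count"
echo "$starts"
echo "$numbered"
echo "$inverted"
```

Note that the uppercase PATH line is only found once -i is added.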

sort

Sorts the lines of a file.

Basic syntax

sort name.txt # sort the lines of name.txt

Example

To demonstrate, first create a file named sort.txt with the following content:

[zhangsan@localhost ~]$ vi sort.txt
[zhangsan@localhost ~]$ cat sort.txt 
apple
origin
red
blue
black
nick
Yjs

[zhangsan@localhost ~]$ sort sort.txt 

apple
black
blue
nick
origin
red
Yjs
[zhangsan@localhost ~]$

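Running sort sort.txt prints the lines in sorted order; the file itself is not modified. The exact order (for example, where the capitalized Yjs ends up) depends on the locale.

Common options

-o writes the sorted result to a file: sort -o sort_sorted.txt sort.txt
-r sorts in reverse order: sort -r sort.txt
-R sorts in random order: sort -R sort.txt
-n sorts numerically. By default, numbers are compared as strings, so 138 comes before 25; with -n, 25 comes before 138.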
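
The difference between text order and numeric order is easy to check. A minimal sketch, assuming a made-up file numbers.txt; LC_ALL=C is set only to make the text comparison independent of the locale.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
numbers="$workdir/numbers.txt"

printf '%s\n' 138 25 7 > "$numbers"

# Default: lines are compared as text, character by character,
# so "138" sorts before "25", which sorts before "7".
as_text=$(LC_ALL=C sort "$numbers")

# -n compares by numeric value.
as_numbers=$(sort -n "$numbers")

# -n and -r together: largest number first.
descending=$(sort -nr "$numbers")

# -o may name the input file itself, which sorts the file in place.
# (Plain "sort f > f" would empty f before sort reads it.)
sort -n -o "$numbers" "$numbers"

echo "$as_text"
echo "$as_numbers"
echo "$descending"
```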

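wc

Short for "word count". It reports statistics about a file: the number of words, lines, characters, bytes, and so on.

Basic syntax

wc sort.txt # print the statistics for sort.txt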

Example

[zhangsan@localhost ~]$ wc sort.txt 
 8  7 38 sort.txt
[zhangsan@localhost ~]$

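The first number, 8, is the number of lines;
the second, 7, is the number of words;
the third, 38, is the number of bytes.

Common options

-l counts only lines: wc -l sort.txt
-w counts only words: wc -w sort.txt
-c counts only bytes: wc -c sort.txt
-m counts only characters: wc -m sort.txt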
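
Bytes and characters differ as soon as the text contains multi-byte characters. The sketch below uses a made-up file sample.txt whose second line ends with 很, written as its three UTF-8 bytes so that the script does not depend on the editor's encoding.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/sample.txt"

# Two lines, three words; \345\276\210 is the UTF-8 encoding of 很.
printf 'apple red\nblue\345\276\210\n' > "$sample"

# Reading from stdin keeps the file name out of the output, and
# $(( ... )) strips the padding that some wc implementations add.
lines=$(( $(wc -l < "$sample") ))
words=$(( $(wc -w < "$sample") ))
bytes=$(( $(wc -c < "$sample") ))

# -m counts characters, so its result depends on the current locale.
chars=$(( $(wc -m < "$sample") ))

echo "lines=$lines words=$words bytes=$bytes chars=$chars"
```

In a UTF-8 locale the character count is lower than the byte count, because 很 is one character stored in three bytes.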

uniq

Removes repeated lines from a file.

Basic syntax

uniq sort.txt # collapse repeated adjacent lines in sort.txt and print the result to the screen

[zhangsan@localhost ~]$ cat sort.txt 
apple
origin
red
blue
black
nick
Yjs
Yjs
blue
[zhangsan@localhost ~]$ uniq sort.txt 
apple
origin
red
blue
black
nick
Yjs
blue
[zhangsan@localhost ~]$ 


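uniq sort.txt uniq_sort.txt # save the de-duplicated result as uniq_sort.txt

[Note] uniq only removes duplicate lines that are adjacent. In the example above, the two consecutive Yjs lines are merged, but the second blue remains.

Common options

-c prefixes each line with the number of times it occurred: uniq -c sort.txt
-d shows only the lines that are repeated: uniq -d sort.txt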
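
Because uniq only looks at neighbouring lines, it is usually combined with sort. A small sketch with a made-up file colors.txt; LC_ALL=C fixes the sort order so that uppercase letters come first.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
colors="$workdir/colors.txt"

printf '%s\n' apple blue Yjs Yjs blue > "$colors"

# uniq alone merges the adjacent Yjs pair; both blue lines survive.
adjacent_only=$(uniq "$colors")

# Sorting first makes every duplicate adjacent.
fully_unique=$(LC_ALL=C sort "$colors" | uniq)

# -d prints one copy of each line that occurs more than once.
repeated=$(LC_ALL=C sort "$colors" | uniq -d)

echo "$adjacent_only"
echo "$fully_unique"
echo "$repeated"

# -c prefixes every line with its number of occurrences.
LC_ALL=C sort "$colors" | uniq -c
```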

cut

Extracts part of each line of a file.

Basic syntax

cut -c 2-4 sort.txt # print the second through fourth characters of every line

[zhangsan@localhost ~]$ touch notes.csv
[zhangsan@localhost ~]$ vi notes.csv 
[zhangsan@localhost ~]$ ls
aa.txt  bb  notes.csv  sort.txt
[zhangsan@localhost ~]$ cat notes.csv 
Mark1,951/100,很不错1
Mark2,952/100,很不错2
Mark3,953/100,很不错3
Mark4,954/100,很不错4
Mark5,955/100,很不错5
Mark6,956/100,很不错6
[zhangsan@localhost ~]$ cut -d , -f 1 notes.csv 
Mark1
Mark2
Mark3
Mark4
Mark5
Mark6
[zhangsan@localhost ~]$

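Common options

-d specifies the field delimiter (a comma, a semicolon, a double quote, and so on); it is used together with -f
-f selects which delimiter-separated field or fields to keep: cut -d , -f 1 notes.csv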
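
A few more ways to select fields, sketched on a made-up file scores.csv that mirrors the layout of notes.csv (the third column is shortened to ASCII for illustration).

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
scores="$workdir/scores.csv"

printf '%s\n' 'Mark1,951/100,ok1' 'Mark2,952/100,ok2' > "$scores"

# A single field.
names=$(cut -d , -f 1 "$scores")

# A list of fields: the first and the third, still joined by the delimiter.
names_and_notes=$(cut -d , -f 1,3 "$scores")

# An open range: the second field and everything after it.
from_second=$(cut -d , -f 2- "$scores")

# -c works on character positions and ignores the delimiter.
chars=$(cut -c 2-4 "$scores")

echo "$names"
echo "$names_and_notes"
echo "$from_second"
echo "$chars"
```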

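2. Redirection, Pipes, and Streams

In Linux, the output of a command can go to three places: the terminal, a file, or another command as its input.

A command usually takes its input from the keyboard and sends its output to the terminal, a file, and so on. The standard terms for these channels are stdin, stdout, and stderr.

Standard input (stdin): the input a command reads, by default whatever is typed at the keyboard. A command then produces two kinds of output:
Standard output (stdout): the normal output shown in the terminal (no error messages);
Standard error (stderr): the error messages shown in the terminal.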

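Redirection

Sends a command's result somewhere other than the terminal where it would normally be displayed (to a file, or to another command as input).

Output redirection `>`

> redirects output to a new file. For example, cut -d , -f 1 notes.csv > name.csv splits each line of notes.csv at the commas (giving three fields), takes the first field, and redirects it to the file name.csv.

Let's walk through a concrete example. Suppose we have a file notes.csv with the following content: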

[zhangsan@localhost ~]$ touch notes.csv
[zhangsan@localhost ~]$ vi notes.csv 
[zhangsan@localhost ~]$ ls
aa.txt  bb  notes.csv  sort.txt
[zhangsan@localhost ~]$ cat notes.csv 
Mark1,951/100,很不错1
Mark2,952/100,很不错2
Mark3,953/100,很不错3
Mark4,954/100,很不错4
Mark5,955/100,很不错5
Mark6,956/100,很不错6

Running cut -d , -f 1 notes.csv > name.csv gives the following result:

[zhangsan@localhost ~]$ cut -d , -f 1 notes.csv > name.csv
[zhangsan@localhost ~]$ ls
aa.txt  bb  name.csv  notes.csv  sort.txt
[zhangsan@localhost ~]$ cat name.csv 
Mark1
Mark2
Mark3
Mark4
Mark5
Mark6
[zhangsan@localhost ~]$

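[Note] Be careful with >. If the target file does not exist, it is created; if it already exists, it is overwritten. Take great care with this operation so that you do not overwrite an important file.

Output redirection `>>`

>> redirects output to the end of a file, so it is less dangerous than >: the output is appended to the file (and the file is still created if it does not exist).

Running cut -d , -f 1 notes.csv >> name.csv again appends the names to name.csv.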

[zhangsan@localhost ~]$ cut -d , -f 1 notes.csv >> name.csv
[zhangsan@localhost ~]$ cat name.csv 
Mark1
Mark2
Mark3
Mark4
Mark5
Mark6
Mark1
Mark2
Mark3
Mark4
Mark5
Mark6
[zhangsan@localhost ~]$

The log files we read day to day are typically built up in this way, by appending to the end of the file.
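
The contrast between overwriting and appending can be seen side by side. This sketch uses a made-up file app.log; it also shows the shell's noclobber option (set -C), which is not covered above and is included here as an optional safeguard against accidental overwrites.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
log="$workdir/app.log"

echo 'first' > "$log"      # creates the file
echo 'second' > "$log"     # silently replaces the previous content
after_overwrite=$(cat "$log")

echo 'third' >> "$log"     # appends to the end
after_append=$(cat "$log")

# With noclobber on, > refuses to overwrite an existing file.
# The subshell keeps the failed redirection isolated from the script.
set -C
if ( echo 'fourth' > "$log" ) 2> /dev/null; then
  clobbered=yes
else
  clobbered=no
fi
set +C

echo "$after_overwrite"
echo "$after_append"
echo "clobbered=$clobbered"
```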

Output redirection `2>`

Redirects standard error.

cat not_exist_file.csv > res.txt 2> errors.log

[zhangsan@localhost ~]$ cat ssss
cat: ssss: 没有那个文件或目录
[zhangsan@localhost ~]$ cat ssss > res.txt
cat: ssss: 没有那个文件或目录
[zhangsan@localhost ~]$ cat ssss > res.txt 2> errors.log
[zhangsan@localhost ~]$ ls
aa.txt  bb  errors.log  name.csv  notes.csv  res.txt  sort.txt
[zhangsan@localhost ~]$ cat errors.log 
cat: ssss: 没有那个文件或目录
[zhangsan@localhost ~]$

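When we cat a file, its content is printed to the screen; this is standard output.
With > res.txt , standard output is no longer printed to the screen; it is written to the file res.txt instead. Error messages still appear on the screen.
With 2> errors.log , any error messages are written to the file errors.log .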
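
To see both channels at work in one command, the sketch below asks cat for one file that exists and one that does not. All file names are made up; /dev/null is the standard device that discards whatever is written to it.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
echo 'hello' > "$workdir/exists.txt"

# cat writes the content of exists.txt to stdout and reports
# the missing file on stderr; each channel gets its own file.
cat "$workdir/exists.txt" "$workdir/missing.txt" > "$workdir/out.txt" 2> "$workdir/err.txt"
status=$?

out=$(cat "$workdir/out.txt")
err_lines=$(( $(wc -l < "$workdir/err.txt") ))

# Errors can also be thrown away by redirecting them to /dev/null.
quiet=$(cat "$workdir/exists.txt" "$workdir/missing.txt" 2> /dev/null)

echo "exit status: $status"
echo "stdout: $out"
echo "stderr lines: $err_lines"
```

The non-zero exit status shows that the command still failed, even though nothing was printed to the screen.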

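Output redirection `2>>`

Redirects standard error and appends it to the end of the file, in the same way as >> .

Output redirection `2>&1`

Sends standard output and standard error to the same place. Note that 2>&1 must come after the file redirection.

cat not_exist_file.csv > res.txt 2>&1  # overwrite
cat not_exist_file.csv >> res.txt 2>&1 # append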

[zhangsan@localhost ~]$ cat not.csv > res.txt 2>&1
[zhangsan@localhost ~]$ ls
aa.txt  bb  errors.log  name.csv  notes.csv  res.txt  sort.txt
[zhangsan@localhost ~]$ cat res.txt 
cat: not.csv: 没有那个文件或目录
[zhangsan@localhost ~]$
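
The position of 2>&1 matters because the shell processes redirections from left to right. A minimal sketch with made-up file names:

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
missing="$workdir/missing.txt"

# File first, then 2>&1: stdout already points at both.txt when
# stderr is duplicated onto it, so the error ends up in the file.
cat "$missing" > "$workdir/both.txt" 2>&1
captured=$(( $(wc -l < "$workdir/both.txt") ))

# 2>&1 first: stderr is duplicated onto the current stdout (here the
# command substitution) before stdout is moved to the file.
leaked=$(cat "$missing" 2>&1 > "$workdir/empty.txt")

echo "lines captured in both.txt: $captured"
echo "message that escaped the file: $leaked"
```

In the second form the file stays empty and the error message goes wherever standard output pointed originally.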

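To summarize the output operators:

> writes the output and overwrites the existing content of the file

>> appends the output to the end of the existing file

So far, the input to every command we have used came from its arguments. A command's input can also come from a file or from the keyboard.

Input redirection `<`

The < symbol specifies where a command reads its input from.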

[zhangsan@localhost ~]$ cat name.csv   
this is a name Text
[zhangsan@localhost ~]$ cat < name.csv  # use name.csv as the command's input
this is a name Text
[zhangsan@localhost ~]$

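Although the result is the same as that of cat name.csv , the two work quite differently.

cat name.csv : cat receives the file name name.csv as an argument, so it has to open the file itself and then print its content.
cat < name.csv : cat receives the content of name.csv directly on standard input and only has to print it; opening the file and passing its content to cat is handled by the shell.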
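
With cat the two forms look identical, but wc makes the difference visible, because it prints the file name only when it knows one. A small sketch with a made-up file letters.txt:

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
letters="$workdir/letters.txt"
printf 'a\nb\nc\n' > "$letters"

# wc opens the file itself, so it knows the name and prints it.
with_name=$(wc -l "$letters")

# The shell opens the file and connects it to stdin;
# wc only sees a stream of data and has no name to print.
without_name=$(wc -l < "$letters")

echo "$with_name"
echo "$without_name"
```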

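Input redirection `<<`

Redirects keyboard input to a command as its input.

sort -n << END # after you type this command and press Enter, the terminal waits for keyboard input; END marks the end of the input (you can choose any word)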
[zhangsan@localhost ~]$ sort -n << END
> apple
> name
> hello
> yjs
> nihao
> not 
> like
> END
apple
hello
like
name
nihao
not 
yjs
[zhangsan@localhost ~]$ 


wc -m << END # count the characters in the input
[zhangsan@localhost ~]$ wc -m << END
> a d f s f f fas fasd ds fdf sf sdfa
> END
36
[zhangsan@localhost ~]$
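
The same << syntax also works inside scripts, where the lines up to the end marker take the place of keyboard input. The sketch below shows this, plus one detail not mentioned above: quoting the end marker switches off variable expansion in the body. The variable name and its value are made up.

```shell
name='zhangsan'   # hypothetical value, used only for this demonstration

# The body of the here-document becomes the input of sort -n.
sorted=$(sort -n << END
138
25
7
END
)

# Unquoted end marker: variables in the body are expanded.
expanded=$(cat << END
hello $name
END
)

# Quoted end marker: the body is passed through literally.
literal=$(cat << 'END'
hello $name
END
)

echo "$sorted"
echo "$expanded"
echo "$literal"
```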

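Pipe `|`

Chains two commands together so that the output of one becomes the input of the other. The English term is pipeline; picture sections of water pipe joined end to end. A pipe is one form of stream redirection.

Here are a few practical examples:

cut -d , -f 1 name.csv | sort > sorted_name.txt 
# The list of names from the first step is passed through the pipe to sort, and the result is written to sorted_name.txt

du | sort -nr | head 
# du lists the size of each directory
# sort sorts the lines; -n sorts numerically and -r reverses the order
# head keeps the first 10 lines

grep log -Ir /var/log | cut -d : -f 1 | sort | uniq
# grep log -Ir /var/log searches the /var/log directory for the text log ; -r searches recursively and -I skips binary files
# cut -d : -f 1 splits each line at the colons and keeps the first field
# sort sorts the lines
# uniq removes the duplicates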
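
The commands from the first part combine into a common pattern: counting how often each value occurs and showing the most frequent ones. A sketch on made-up input, so that the result is predictable:

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' blue red blue apple blue red |
  LC_ALL=C sort |       # group identical lines together
  uniq -c |             # collapse each group into "count value"
  LC_ALL=C sort -nr |   # highest count first
  head -n 2 > "$workdir/top.txt"   # keep the two most frequent

top=$(cat "$workdir/top.txt")
echo "$top"
```

Each stage reads the previous stage's output as soon as it is available; no intermediate files are needed until the final result is saved.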

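Streams

A stream is not a command. In computer science the notion of a stream can be hard to grasp, but one point is enough to remember: a stream means reading a little data and processing it, a little at a time, rather than loading everything at once. The data is usually treated as raw bytes. The redirections and pipes described above all move data as streams.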
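
Two small experiments make the "a little at a time" idea concrete. The yes command, which is not covered in this article, prints the line y endlessly; it is used here only as an example of a data source that never ends.

```shell
# yes never stops on its own, yet the pipeline finishes: head reads
# three lines as they arrive and exits, which in turn stops yes.
first_three=$(yes | head -n 3)

# A loop can consume a stream line by line, keeping only a running
# total in memory instead of the whole input.
total=0
while read -r n; do
  total=$((total + n))
done << END
1
2
3
END

echo "$first_three"
echo "total=$total"
```

If pipes had to wait for the complete output of the previous command, the first pipeline could never finish.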

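That completes our tour of streams, redirection, and pipes, some of the more advanced Linux concepts and commands. Streams and pipes are also widely used elsewhere: Angular's template syntax supports pipes, and Node.js has its own concept of streams.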

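3. Archiving and Compressing Files

Archiving: combining multiple files into a single file. This is also called packing.
Compressing: turning a large file (usually an archive) into a smaller one.

We usually use tar to bundle multiple files into a single file called an archive, and then use gzip or bzip2 to compress the archive into a smaller file.

tar

Creates a tar archive.

Common options

-c creates an archive (create)
-x extracts an archive (extract)
-z compresses or decompresses the archive with gzip
-v lists the files as they are processed (verbose)
-f specifies the archive file name; the name must come immediately after f (file)
-cvf combines create + verbose + file: creates an archive and shows the details;
-tf lists the contents of an archive without extracting it;
-rvf appends files to an archive: tar -rvf archive.tar file.txt
-xvf extracts an archive: tar -xvf archive.tar

Basic usage

tar -cvf sort.tar sort/ # archive the sort directory as sort.tar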
[zhangsan@localhost ~]$ ls
aa.txt  errors.log  notes.csv  sort_note.txt  test
bb      name.csv    res.txt    sort.txt       www.txt
[zhangsan@localhost ~]$ tar -cvf ss.tar test/
test/
test/b.txt
test/c.txt
test/a.txt
[zhangsan@localhost ~]$ ls
aa.txt  errors.log  notes.csv  sort_note.txt  ss.tar  www.txt
bb      name.csv    res.txt    sort.txt       test
[zhangsan@localhost ~]$ 


tar -cvf archive.tar file1 file2 file3 # archive file1, file2, and file3 as archive.tar
[zhangsan@localhost test]$ touch a.txt b.txt c.txt
[zhangsan@localhost test]$ ls
a.txt  b.txt  c.txt
[zhangsan@localhost test]$ tar -cvf ss.tar a.txt b.txt c.txt 
a.txt
b.txt
c.txt
[zhangsan@localhost test]$ ls
a.txt  b.txt  c.txt  ss.tar
[zhangsan@localhost test]$
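
The create, append, list, and extract options can be exercised together in a scratch directory. This is a sketch with made-up file names; it also uses tar's -C option, which is not covered above and tells tar to change into a directory before reading or writing files.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/test" "$workdir/restore"
echo 'A' > "$workdir/test/a.txt"
echo 'B' > "$workdir/test/b.txt"
echo 'C' > "$workdir/extra.txt"

# Create: -C makes the archive store the relative path test/...
tar -cf "$workdir/ss.tar" -C "$workdir" test

# Append one more file to the existing, uncompressed archive.
tar -rf "$workdir/ss.tar" -C "$workdir" extra.txt

# List the contents without extracting anything.
listing=$(tar -tf "$workdir/ss.tar")

# Extract into a separate directory.
tar -xf "$workdir/ss.tar" -C "$workdir/restore"
restored=$(cat "$workdir/restore/test/a.txt")

echo "$listing"
echo "restored content: $restored"
```

Leaving out -v keeps the commands quiet, which is convenient in scripts.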

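gzip / gunzip

Compress and decompress a file such as an archive. gzip is the usual choice; it appends .gz to the file name, so a compressed tar archive ends in .tar.gz .

gzip archive.tar # compress
gunzip archive.tar.gz # decompress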
[zhangsan@localhost ~]$ ls
aa.txt  errors.log  notes.csv  sort_note.txt  ss.tar  www.txt
bb      name.csv    res.txt    sort.txt       test
[zhangsan@localhost ~]$ gzip ss.tar 
[zhangsan@localhost ~]$ ls
aa.txt  errors.log  notes.csv  sort_note.txt  ss.tar.gz  www.txt
bb      name.csv    res.txt    sort.txt       test
[zhangsan@localhost ~]$ gunzip ss.tar.gz 
[zhangsan@localhost ~]$ ls
aa.txt  errors.log  notes.csv  sort_note.txt  ss.tar  www.txt
bb      name.csv    res.txt    sort.txt       test
[zhangsan@localhost ~]$
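
As the listing shows, gzip replaces ss.tar with ss.tar.gz. When the original should be kept, the -c option (not covered above) writes the compressed data to standard output instead. A sketch with made-up file names:

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'line %s\n' 1 2 3 > "$workdir/notes.txt"

# Default behaviour: copy.txt is replaced by copy.txt.gz.
cp "$workdir/notes.txt" "$workdir/copy.txt"
gzip "$workdir/copy.txt"
if [ -e "$workdir/copy.txt" ]; then replaced=no; else replaced=yes; fi

# -c writes to stdout, so the original stays in place.
gzip -c "$workdir/notes.txt" > "$workdir/notes.txt.gz"

# gunzip -c decompresses to stdout; the round trip returns the original.
roundtrip=$(gunzip -c "$workdir/notes.txt.gz")

echo "original replaced by plain gzip: $replaced"
echo "$roundtrip"
```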

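tar: archiving + compression

tar can archive and compress in one step. Adding one more option makes tar call gzip or bzip2 to compress the archive once it has been created.

tar -zcvf archive.tar.gz archive/ # archive and compress the archive directory
tar -zxvf archive.tar.gz # decompress and extract archive.tar.gz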
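
A round trip through a compressed archive, sketched with made-up file names. The -C option, which selects the working directory, is an addition to the commands above.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/archive" "$workdir/out"
echo 'hello' > "$workdir/archive/readme.txt"

# z = gzip, c = create, f = archive file name.
tar -zcf "$workdir/archive.tar.gz" -C "$workdir" archive

# t lists the compressed archive without unpacking it.
listing=$(tar -ztf "$workdir/archive.tar.gz")

# x extracts; -C selects the target directory.
tar -zxf "$workdir/archive.tar.gz" -C "$workdir/out"
content=$(cat "$workdir/out/archive/readme.txt")

echo "$listing"
echo "$content"
```

For bzip2, the -j option takes the place of -z; such archives conventionally end in .tar.bz2.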

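zcat, zless, zmore

As covered earlier, cat, less, and more display the content of a file. They cannot display the content of a compressed file; use zcat, zless, and zmore for that. These commands decompress a gzip file on the fly, so they are most useful for compressed text files.

zcat archive.tar.gz # prints the decompressed tar stream; tar -ztf archive.tar.gz lists the file names instead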
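
A typical use is reading or searching a compressed log without unpacking it first. The sketch assumes a Linux system where zcat handles .gz files; app.log and its contents are made up.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '%s\n' 'INFO start' 'ERROR disk full' 'INFO stop' > "$workdir/app.log"
gzip "$workdir/app.log"    # leaves only app.log.gz

# zcat decompresses to stdout and leaves app.log.gz untouched,
# so it slots straight into a pipeline.
errors=$(zcat "$workdir/app.log.gz" | grep 'ERROR')

echo "$errors"
```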

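zip / unzip

Compress and decompress zip files (zip archives often come from Windows systems).

Installation

# On Red Hat family distributions
yum install zip 
yum install unzip

Basic usage

unzip archive.zip # extract a .zip file
unzip -l archive.zip # list the contents of a .zip file without extracting it

zip -r sort.zip sort/ # compress the sort directory into sort.zip; -r means recursive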
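
A round trip with zip and unzip, sketched with made-up file names. Because the two tools are not always installed, the script checks for them first. The -q (quiet) option and unzip's -d (target directory) option are additions to the commands above.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/sort" "$workdir/out"
echo 'apple' > "$workdir/sort/a.txt"

if command -v zip > /dev/null && command -v unzip > /dev/null; then
  # zip stores paths relative to the current directory, so change
  # into the scratch directory inside a subshell.
  ( cd "$workdir" && zip -qr sort.zip sort/ )

  # -d extracts into the given directory instead of the current one.
  unzip -q "$workdir/sort.zip" -d "$workdir/out"
  zip_result=$(cat "$workdir/out/sort/a.txt")
else
  zip_result='skipped: zip or unzip is not installed'
fi

echo "$zip_result"
```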

Author: 理想

Link: https://www.imyjs.cn/archives/832

Copyright belongs to the author. Do not reproduce without permission.
