C++字符串算法[编辑 | 编辑源代码]

C++字符串算法是处理和分析文本数据的基础工具集，涵盖从简单搜索到复杂模式匹配的操作。标准库 `<algorithm>` 和 `<string>` 提供了丰富的函数，结合迭代器可实现高效字符串处理。本专题将系统讲解核心算法及其应用场景。

核心算法分类[编辑 | 编辑源代码]

字符串算法可分为以下几类：

查找与搜索：子串定位、字符匹配
修改与转换：大小写转换、替换
分割与连接：基于分隔符拆分字符串
比较与验证：字典序比较、格式校验

基础查找算法[编辑 | 编辑源代码]

find() 系列函数[编辑 | 编辑源代码]

在字符串中定位子串或字符的最基础方法：

#include <string>
#include <iostream>

int main() {
    std::string text = "The quick brown fox jumps over the lazy dog";
    size_t pos = text.find("fox");  // 返回首次出现的位置
    
    if (pos != std::string::npos) {
        std::cout << "Found at position: " << pos << "\n";  // 输出: 16
    } else {
        std::cout << "Not found\n";
    }
}

关键点：

返回类型为 `size_t`，失败时返回 `std::string::npos`
变体函数：

 * `rfind()`：反向搜索
 * `find_first_of()`：查找字符集合中任意字符
 * `find_last_not_of()`：查找不在指定集合的最后字符

高级模式匹配[编辑 | 编辑源代码]

正则表达式（C++11起）[编辑 | 编辑源代码]

`<regex>` 库提供完整的正则表达式支持：

#include <regex>
#include <string>

void validate_email(const std::string& email) {
    std::regex pattern(R"([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})");
    if (std::regex_match(email, pattern)) {
        std::cout << "Valid email\n";
    } else {
        std::cout << "Invalid email\n";
    }
}

正则表达式组件说明：

`[]`：字符集合
`+`：1次或多次重复
`\.`：转义点号
`{2,}`：至少2次重复

字符串转换[编辑 | 编辑源代码]

数值转换（C++11起）[编辑 | 编辑源代码]

安全类型转换方法：

#include <string>
#include <stdexcept>

void convert_example() {
    std::string num_str = "3.14159";
    try {
        double pi = std::stod(num_str);  // 字符串转double
        int integer = std::stoi("42");   // 字符串转int
    } catch (const std::invalid_argument& e) {
        std::cerr << "Conversion failed: " << e.what() << '\n';
    }
}

实用算法案例[编辑 | 编辑源代码]

单词频率统计[编辑 | 编辑源代码]

结合map容器实现文本分析：

#include <map>
#include <sstream>
#include <algorithm>

void word_counter(const std::string& text) {
    std::map<std::string, int> freq;
    std::istringstream iss(text);
    std::string word;
    
    while (iss >> word) {
        // 转换为小写并统计
        std::transform(word.begin(), word.end(), word.begin(), ::tolower);
        ++freq[word];
    }
    
    // 输出结果
    for (const auto& [w, count] : freq) {
        std::cout << w << ": " << count << '\n';
    }
}

性能优化技巧[编辑 | 编辑源代码]

1. 预留空间：使用 `reserve()` 预分配内存避免重复分配

   std::string large_str;
   large_str.reserve(100000);  // 预分配100KB

2. 移动语义：C++11后优先使用移动而非复制

   std::string process_text(std::string&& input) {
       // 处理输入字符串
       return std::move(input);  // 移动而非复制
   }

3. 视图模式：C++17的 `string_view` 避免不必要的复制

   void analyze(std::string_view text) {
       // 只读操作无需复制原字符串
   }

复杂度分析[编辑 | 编辑源代码]

常见操作的时间复杂度（n为字符串长度）：

时间复杂度对照表
操作	复杂度
`find()`/`rfind()`	O(n)
`substr()`	O(n)
`operator+`	O(n+m)
`push_back()`	平均O(1)

数学表示： $O (n)$ 表示线性时间复杂度

进阶话题[编辑 | 编辑源代码]

Unicode处理：使用ICU库处理多语言文本
并行算法：C++17的并行执行策略加速处理
字符串压缩：结合zlib等库实现高效存储

练习题目[编辑 | 编辑源代码]

1. 实现反转字符串的函数，要求原地修改 2. 编写检测回文字符串的算法（忽略大小写和标点） 3. 设计字符串压缩算法（如"aaabbbcc"→"a3b3c2"）

通过系统掌握这些算法，开发者可高效解决90%以上的文本处理需求。建议结合实际问题进行针对性练习，逐步深入理解各算法的适用场景和优化方法。