tidyverse
An essential toolkit for researchers
R
packages
First, let's learn tidyverse, the core collection of R packages that every researcher should know
[@tidyverse-2]
1 Setup
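A minimal setup sketch, assuming the tidyverse is installed from CRAN and that the examples on this page use the `starwars` dataset that ships with dplyr. The install line is commented out because it only needs to run once.

```r
# install.packages("tidyverse")  # run once if the package is not installed yet
library(tidyverse)

# starwars is an example dataset bundled with dplyr
glimpse(starwars)
```

`glimpse()` prints one line per column, which is a quick way to check variable names and types before selecting or filtering.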
2 Basic operations:
2.1 Choose existing variables (columns): select
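As a small illustration of `select()`, here is a sketch on the built-in `starwars` data. The object names (`basics`, `colors`, `no_lists`) are made up for this example.

```r
library(dplyr)

# Pick columns by name; the result keeps the order given here
basics <- starwars %>% select(name, height, mass)

# Pick columns with a helper: every column whose name ends in "_color"
colors <- starwars %>% select(name, ends_with("_color"))

# Drop columns by prefixing them with a minus sign
no_lists <- starwars %>% select(-films, -vehicles, -starships)

names(colors)
```

`select()` only changes which columns are kept. The number of rows stays the same.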
2.2 Filter observations (rows): filter
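A short sketch of `filter()` on the same `starwars` data. The conditions and object names are chosen only for illustration.

```r
library(dplyr)

# Keep rows where a single condition is TRUE
humans <- starwars %>% filter(species == "Human")

# Conditions separated by commas are combined with AND
tall_humans <- starwars %>% filter(species == "Human", height > 180)

# %in% matches any value in a set
humans_or_droids <- starwars %>% filter(species %in% c("Human", "Droid"))

nrow(humans)
```

Rows where a condition evaluates to `NA` are dropped, so characters with a missing `species` or `height` do not appear in these results.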
2.3 Create new variables: mutate
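A minimal sketch of `mutate()`, assuming `height` is recorded in centimetres and `mass` in kilograms, as in the `starwars` data. The column names `height_m` and `bmi` are made up for this example.

```r
library(dplyr)

bmi_tbl <- starwars %>%
  select(name, height, mass) %>%
  mutate(
    height_m = height / 100,     # new column: height converted to metres
    bmi      = mass / height_m^2 # a column created earlier in the same call can be used right away
  )

names(bmi_tbl)
```

The result has five columns (`name`, `height`, `mass`, `height_m`, `bmi`) and the same number of rows as the input.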
# A tibble: 6 × 14
name height mass hair_color skin_color eye_color birth_year sex gender
<chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr>
1 Luke Sky… 172 77 blond fair blue 19 male mascu…
2 C-3PO 167 75 <NA> gold yellow 112 none mascu…
3 R2-D2 96 32 <NA> white, bl… red 33 none mascu…
4 Darth Va… 202 136 none white yellow 41.9 male mascu…
5 Leia Org… 150 49 brown light brown 19 fema… femin…
6 Owen Lars 178 120 brown, gr… light blue 52 male mascu…
# ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
# vehicles <list>, starships <list>
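library(tidyverse)

starwars %>%
  select(gender, mass, height, species) %>%  # variable names autocomplete while selecting, which speeds up typing
  filter(species == "Human") %>%
  na.omit() %>%  # drop rows that contain NA
  mutate(height = height / 100,  # cm to m; reusing an existing name overwrites that variable
         BMI = mass / height^2) %>%  # a new name creates a new variable
  summarise(Average_BMI = mean(BMI), .by = gender)  # since dplyr 1.1.0, .by can replace a separate group_by()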
# A tibble: 2 × 2
gender Average_BMI
<chr> <dbl>
1 masculine 25.7
2 feminine 20.8
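The comment on `.by` in the pipeline above can be checked with a small comparison. This is a sketch assuming dplyr 1.1.0 or later. The object names (`humans`, `by_arg`, `by_group`) are made up for illustration.

```r
library(dplyr)

humans <- starwars %>%
  filter(species == "Human") %>%
  select(gender, mass, height) %>%
  na.omit() %>%
  mutate(BMI = mass / (height / 100)^2)

# Per-operation grouping: the result is ungrouped and groups
# appear in order of first appearance in the data
by_arg <- humans %>%
  summarise(Average_BMI = mean(BMI), .by = gender)

# Classic grouping: groups are sorted by the grouping key
by_group <- humans %>%
  group_by(gender) %>%
  summarise(Average_BMI = mean(BMI))

# Same values once both results are put in the same row order
all.equal(arrange(by_arg, gender)$Average_BMI, by_group$Average_BMI)
```

The two forms compute the same averages. The visible difference is row order, which is why `masculine` comes first in the output above.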
2.4 Tutorial video
3 Classification: case_when()
case_when() assigns each observation to a category according to a set of conditions.
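A minimal sketch of `case_when()` that turns numeric scores into letter grades. The data and the cutoffs (90/80/70/60) are hypothetical and made up for illustration. `.default` assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical scores, not the sample data used on this page
scores <- tibble(
  name  = paste("student", 1:5),
  score = c(80, 66, 72, 49, 95)
)

graded <- scores %>%
  mutate(grade = case_when(
    score >= 90 ~ "A",  # conditions are checked top to bottom;
    score >= 80 ~ "B",  # the first one that is TRUE wins
    score >= 70 ~ "C",
    score >= 60 ~ "D",
    .default = "F"      # everything that matched no condition
  ))

graded$grade
# "B" "D" "C" "F" "A"
```

Because the first matching condition wins, the order matters: putting `score >= 60` first would label every passing score "D".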
3.1 Import sample data
3.2 Demo
# A tibble: 34 × 3
name score grade
<chr> <dbl> <chr>
1 student 1 80 B
2 student 2 66 D
3 student 3 72 c
4 student 4 75 c
5 student 5 74 c
6 student 6 71 c
7 student 7 77 c
8 student 8 49 F
9 student 9 66 D
10 student 10 84 B
# ℹ 24 more rows
# A tibble: 6 × 6
name species height hair_color skin_color eye_color
<chr> <chr> <int> <chr> <chr> <chr>
1 Luke Skywalker Human 172 blond fair blue
2 Leia Organa Human 150 brown light brown
3 Owen Lars Human 178 brown, grey light blue
4 Beru Whitesun Lars Human 165 brown light blue
5 Biggs Darklighter Human 183 black light brown
6 Anakin Skywalker Human 188 blond fair blue
# A tibble: 6 × 5
name species height mass BMI
<chr> <chr> <dbl> <dbl> <dbl>
1 Luke Skywalker Human 1.72 77 26.0
2 C-3PO Droid 1.67 75 26.9
3 R2-D2 Droid 0.96 32 34.7
4 Darth Vader Human 2.02 136 33.3
5 Leia Organa Human 1.5 49 21.8
6 Owen Lars Human 1.78 120 37.9
# A tibble: 6 × 3
genus order sleep_total
<chr> <chr> <dbl>
1 Myotis Chiroptera 19.9
2 Eptesicus Chiroptera 19.7
3 Lutreolina Didelphimorphia 19.4
4 Priodontes Cingulata 18.1
5 Didelphis Didelphimorphia 18
6 Dasypus Cingulata 17.4
# A tibble: 10 × 6
name species height hair_color skin_color eye_color
<chr> <chr> <int> <chr> <chr> <chr>
1 Luke Skywalker Human 172 blond fair blue
2 C-3PO Robot 167 <NA> gold yellow
3 R2-D2 Robot 96 <NA> white, blue red
4 Darth Vader Human 202 none white yellow
5 Leia Organa Human 150 brown light brown
6 Owen Lars Human 178 brown, grey light blue
7 Beru Whitesun Lars Human 165 brown light blue
8 R5-D4 Robot 97 <NA> white, red red
9 Biggs Darklighter Human 183 black light brown
10 Obi-Wan Kenobi Human 182 auburn, white fair blue-gray
# A tibble: 2 × 3
sex mean_height mean_mass
<chr> <dbl> <dbl>
1 female 1.72 54.7
2 male 1.78 80.2