Welcome to Data Wrangling with R! In this book, I will help you learn the essentials of preprocessing data with the R programming language, so that you can quickly and easily turn noisy data into usable pieces of information. Data wrangling, also commonly referred to as data munging, transformation, manipulation, or janitor work, can be a painstakingly laborious process. In fact, it's been stated that up to 80% of the time spent on data analysis goes to cleaning and preparing data. However, because data wrangling is a prerequisite to the rest of the data analysis workflow (visualization, analysis, reporting), it's essential that you become fluent and efficient in data wrangling techniques.
This book will guide you through the data wrangling process and give you a solid foundation for working with data in R. My goal is to teach you how to easily wrangle your data, so you can spend more time focused on understanding the content of your data via visualization, analysis, and reporting. By the time you finish reading this book, you will have learned:
- How to work with different types of data, such as numerics, characters, regular expressions, factors, and dates
- The differences between the data structures, and how to create, add components to, and subset each one
- How to acquire and parse data from sources you may not have been able to access before, for example through web scraping
- How to develop your own functions and use loop control structures to reduce code redundancy
- How to use pipe operators to simplify your code and make it more readable
- How to reshape the layout of your data and how to manipulate, summarize, and join data sets
In essence, you will have the data wrangling toolbox required for modern-day data analysis.