Building a Data Deep State @GVSU - The Role of Faculty Partnerships (Part 1)

In my previous post I discussed my effort to deepen the foundations of data management expertise here at GVSU by making sure that my role as our de facto data librarian is closely "embedded" with our Office of Sponsored Programs (OSP). The aim is to broaden the window of engagement with faculty as they pursue federal funding for their research. As they work to put sustainable data archiving and sharing solutions in place, my job is to work with them quickly and closely to document those solutions in a two-page data management plan (DMP).

That post built upon the first, wherein I documented the additional work I am doing to perform "in-reach" training with our liaison librarians, enlisting them to work more closely with me over the course of their routine engagements with faculty, drawing attention to the need for DMPs and good data management practices.

Adding to these two efforts of "embedding" and "in-reach," I am also pursuing a third strategy of "outreach"--partnering with key teaching faculty on campus to deliver targeted training and awareness-raising on issues of good data management.

This is kicking off with our first Data Carpentry Workshop on campus!

On December 18th-19th, the Annis Water Resources Institute (AWRI) and GVSU Libraries--with generous support from the Center for Scholarly and Creative Excellence (CSCE)--are co-hosting a GVSU Data Carpentry Workshop built around the ecology curriculum. The workshop is taught by Auriel Fournier (Mississippi State University) and Chris Hamm (Monsanto Company). Faculty from AWRI and from other departments on campus will get hands-on training in the use of Excel and other spreadsheet applications to curate raw data and then pipe it through tools like OpenRefine, R, RStudio, and SQLite.

This is a huge professional development boost for these faculty researchers!

It is also a really great engagement opportunity for the Libraries! Having such a richly vetted and refined set of resources and experts brought onto campus fills a critical gap in our current outreach capacity. As much as I would love to be a deep expert in my own right when it comes to data, the reality is that I have to spread my professional development across a number of different areas, which keeps me from getting too specific at the discipline or domain level when it comes to curating data. This is where the power of Data Carpentry comes in.


This workshop would not be happening without the initiative and drive of our teaching faculty--in this case Charlyn Partridge from AWRI. Charlyn approached the Libraries this past February through her liaison contact (Matt Ruen) with a proposal to collaborate on planning and soliciting support for the workshop. Although she is a newer faculty member on campus, she recognized the Libraries as a key partner for advancing data initiatives. Matt Ruen immediately put her in touch with me, and Charlyn and I quickly got to work. We began by brainstorming which Data Carpentry curriculum would best fit the needs that Charlyn perceived (in this case Ecology). Then, given Data Carpentry's requirements for technology and workshop best practices, we discussed how best to scope attendance and arrange on-the-ground logistics.

Having our teaching faculty drive initiatives like this makes it easy to get the attention of support centers on campus like our Center for Scholarly and Creative Excellence (CSCE), which was more than happy to provide funding for our two Data Carpentry instructors--their travel, stipends, and accommodations. All things considered, it was an easy sell--Data Carpentry is 1) time-tested and proven; 2) widely deployed at institutions all over the world; and 3) highly sustainable and reusable. For an extremely reasonable set of fees, the Data Carpentry Workshops are an incredible investment.

So, here we are on our first day. I was able to put in a great plug for data management consultation through the Libraries and hand out my contact information so that all of our faculty in attendance know how to reach out after the workshop.


Auriel then kicked us off, walking us step-by-step through a series of best practices for standardizing, stabilizing, and manipulating pesky date-specific information in Excel. She then scaffolded our work from Excel to comma-separated values (CSV) files as a stepping stone towards using OpenRefine to take our data cleanliness to the next level. Auriel frequently gave a nod to the benefits of good curation for ensuring the preservation and longevity of data--for which I am grateful.
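For readers who want to try the same idea outside of Excel, here is a minimal sketch in Python. It is not the material Auriel taught; it simply illustrates one common way to stabilize dates, which is to store an unambiguous ISO 8601 date alongside separate year, month, and day columns and then write the result out as CSV. The list of input formats, the column names, and the sample records are all assumptions made up for this illustration, and the sketch assumes that slash-separated dates are month-first.

```python
import csv
import io
from datetime import datetime

# Hypothetical formats a spreadsheet date column might contain.
# Assumption: slash-separated dates are month/day/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def standardize_rows(rows, date_field="date"):
    """Copy each row and add iso_date, year, month, and day columns."""
    cleaned = []
    for row in rows:
        parsed = parse_date(row[date_field])
        new_row = dict(row)  # leave the raw value untouched
        new_row["iso_date"] = parsed.isoformat() if parsed else ""
        new_row["year"] = parsed.year if parsed else ""
        new_row["month"] = parsed.month if parsed else ""
        new_row["day"] = parsed.day if parsed else ""
        cleaned.append(new_row)
    return cleaned


def to_csv(rows):
    """Serialize row dicts to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample records, for illustration only.
sample = [
    {"site": "A", "date": "12/18/2017"},
    {"site": "B", "date": "2017-12-19"},
    {"site": "C", "date": "not recorded"},
]
cleaned = standardize_rows(sample)
print(to_csv(cleaned))
```

The raw date column is kept as-is so that nothing is lost, and values that cannot be parsed are left blank in the new columns rather than guessed at, which makes them easy to spot and review by hand.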


This was all to set the stage for Chris Hamm's Intro to RStudio in the afternoon, where we learned the basics of using R as a programming language via RStudio's console and script handling functions. The goal was to learn some of R's powerful methods and some basic syntax, getting the lay of the land today so that we can take things to the next level when we come back tomorrow. R really is its own language and can have a bit of a steep learning curve--Chris is great at being patient, concise, systematic, and thorough with his instruction.

I'm pleased to say that this room is filled with engrossed, engaged, and completely enthusiastic faculty from biology, ecology, statistics, engineering...and yes, the Libraries! Not only am I learning the ins and outs of these incredibly powerful data curation and analysis tools, I am also getting insights into the real-world challenges that our faculty face with their data and the questions they ask as they seek to move their skill sets forward.

In my next post I'll share some details about another phase of outreach taking place in our winter semester--this time making use of great resources from the Center for Open Science. Stay tuned!