5 Best Practices for Building a CSV Uploader

Uploading a CSV file is a common capability in web applications across every industry. However, it's also notoriously one of the most frustrating experiences in many applications. A delightful CSV upload experience can set you apart from your competitors and make the right first impression with your customers.

The best CSV uploaders are intuitive, performant, and give the customer ample opportunity to correct issues without leaving your product. Whether your customers are uploading contacts, users, or financial data, the same core principles apply to developing your CSV upload experience.

Building a delightful upload experience is a deceptively complicated undertaking. Without proper planning, teams run into numerous delays. Proper project planning and awareness of the edge cases of CSV upload will lead to a much smoother launch.

CSV Upload: Build vs. Buy?

Engineering teams often face the decision of whether to build technology in-house or adopt a pre-built SaaS solution. OneSchema is an embeddable CSV uploader that allows your engineering team to launch a delightful data upload feature in minutes instead of months. However, a hosted solution is not the best option for every engineering team.

Advantages of Building CSV Upload

Full control over design and features to fit your use case
Ability to update the software to adapt to changing internal and external needs
No recurring 3rd party vendor fees

Advantages of Using a Hosted Upload Tool

Shorter initial launch timeline
No ongoing maintenance cost for your team
Access to advanced features you would otherwise skip due to time constraints (10 advanced data import features you won’t launch yourself)

5 Steps to Building a Delightful CSV Uploader

So you’ve decided to build your own CSV uploader. Here are 5 steps to ensure you get the experience right for your customers.

1. Define your data model(s)

First, define what data you want to collect from your customers. Consider the following questions when defining your data model:

What columns of data will my customers be uploading?
What data type validations should be run on each column?
What other validations, if any, should be run on the data?
Do I expect my customer to upload data in a single file or multiple files?
Do I want my customer to be able to create columns that don’t exist in my system (custom fields) during their upload experience?
Do I need to validate data against my database or a 3P data source? Will this impact the performance of my upload experience?
Will the data uploaded vary by customer (see also: template overrides)?
How big are the typical files my customers will be uploading?
How many columns are being uploaded?

From here, you can define your data template: the specification for the files, columns, data types, and validations for the data customers will be uploading.

A data template is a description of your data model's columns and required validations
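
As a rough illustration, a data template could be expressed in code along the following lines. This is a minimal Python sketch: the ColumnSpec and DataTemplate names, their fields, and the example contacts columns are all hypothetical and are not part of any OneSchema API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ColumnSpec:
    """One column in the template (hypothetical structure)."""
    key: str          # internal field name
    label: str        # header text shown to the customer
    data_type: str    # e.g. "text", "number", "date"
    required: bool = False
    # Optional per-cell check: returns an error message, or None if valid.
    validator: Optional[Callable[[str], Optional[str]]] = None


@dataclass
class DataTemplate:
    """The set of columns expected in one uploaded file."""
    name: str
    columns: List[ColumnSpec] = field(default_factory=list)

    def required_keys(self) -> List[str]:
        return [c.key for c in self.columns if c.required]


def non_empty(value: str) -> Optional[str]:
    return None if value.strip() else "Value is required"


# Example template for a contacts upload (column names are assumptions).
contacts_template = DataTemplate(
    name="contacts",
    columns=[
        ColumnSpec("first_name", "First Name", "text", required=True, validator=non_empty),
        ColumnSpec("email", "Email", "text", required=True, validator=non_empty),
        ColumnSpec("signup_date", "Signup Date", "date"),
    ],
)
```

Keeping the template as data rather than hard-coded logic makes it easier to vary columns per customer later on.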

2. Design your user experience

Understanding your desired workflow

There are several other nuances of your workflow that will influence the design of your experience. You should also consider questions like:

How important is conversion to my CSV upload experience? Should my uploader require all data to be perfectly clean before allowing any data to be uploaded?
What kinds of errors do I expect my customers will have in their data? Will the person performing the upload be able to resolve the errors, or will they need to ask others on their team for support?
Will customers upload more data in the same format in the future?
Do I want my customer to be able to return to an abandoned upload session?
Are there certain types of common errors I want to be able to resolve on behalf of my customer (e.g. date formats, capitalization, etc.)?
How do I want to educate customers about the expected data to be uploaded?
How do I want to explain errors in the data to customers?
Does my column mapping experience need to accommodate a very large number of columns (20+)?

The answers to these questions will determine whether you want to invest in features like accepting partial data submissions, advanced error resolution workflows, saved mappings, session resume, and user education components.
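
Resolving common errors on the customer's behalf, such as date formats and capitalization, might look something like the sketch below. It is written in Python with only the standard library; the function names and the list of accepted date formats are assumptions chosen for illustration.

```python
from datetime import datetime
from typing import Optional

# Formats are tried in order. This list is an assumption; it should reflect
# what your customers actually send.
KNOWN_DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    value = raw.strip()
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def normalize_name(raw: str) -> str:
    """Collapse repeated whitespace and title-case a name."""
    return " ".join(raw.split()).title()
```

Note that a value like 03/07/2024 is ambiguous between month-first and day-first conventions. This sketch silently picks month-first, so a real importer may prefer to flag such values for the customer to confirm.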

What are the common steps in a CSV upload interface?

There are typically 4 steps in a CSV file upload interface:

1. File upload

A component where your user can drag-and-drop a file or use a file picker to select a file to upload. It can be helpful to include user education on this pane to explain to your customers what data you'd like them to upload.

2. Header row selection (optional)

Allows your user to select which row of data is the header. We only recommend implementing this step if your customers frequently have data where the header is not the first row. The vast majority of our customers choose to skip this pane.
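
If you do implement this step, one option is to pre-select a likely header row so that most users can simply confirm it. The Python sketch below uses a simple heuristic (the first row in which every cell is filled and none looks numeric). The function name and the heuristic itself are assumptions and will misfire on some files.

```python
from typing import List


def _looks_numeric(cell: str) -> bool:
    try:
        float(cell.replace(",", ""))
        return True
    except ValueError:
        return False


def guess_header_row(rows: List[List[str]], max_scan: int = 10) -> int:
    """Return the index of the first row that looks like a header, else 0."""
    for index, row in enumerate(rows[:max_scan]):
        cells = [cell.strip() for cell in row]
        # A header row is assumed to have no blank cells and no numbers.
        if cells and all(cells) and not any(_looks_numeric(c) for c in cells):
            return index
    return 0
```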

3. Column mapping

Allows your user to map columns from their uploaded CSV file to your column headers. A nice-to-have feature on this pane is suggested column mappings, typically based on fuzzy matching or historical uploads.
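
Suggested mappings based on fuzzy matching can be sketched with Python's standard difflib module. The suggest_mappings function, the normalization step, and the 0.6 similarity cutoff are assumptions for illustration, not a description of how any particular product does it.

```python
import difflib
from typing import Dict, List, Optional


def _normalize(header: str) -> str:
    """Lowercase and drop punctuation/whitespace so 'E-Mail' matches 'email'."""
    return "".join(ch for ch in header.lower() if ch.isalnum())


def suggest_mappings(
    uploaded: List[str], template: List[str], cutoff: float = 0.6
) -> Dict[str, Optional[str]]:
    """Map each uploaded header to its closest template column, or None."""
    normalized_template = {_normalize(t): t for t in template}
    suggestions: Dict[str, Optional[str]] = {}
    for header in uploaded:
        matches = difflib.get_close_matches(
            _normalize(header), list(normalized_template), n=1, cutoff=cutoff
        )
        suggestions[header] = normalized_template[matches[0]] if matches else None
    return suggestions
```

Headers with no sufficiently close match map to None, which the UI can present as "unmapped" for the user to resolve by hand.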

4. Data review

Shows your customer the errors in the file and assists the customer with data conversion.

3. Choose a CSV parsing library

The CSV format is unfortunately not as consistent as you might imagine. CSV file parsing is one of the most notoriously bug-prone aspects of CSV import (right behind data validation). While there are great open-source libraries out there, none is perfect, and you will likely need to build and maintain an edge-case handling layer on top. As your customers encounter errors over time, you can make your parsing layer more robust. Common issues you may encounter with CSV parsing libraries include special characters, misplaced delimiters, and performance.

OneSchema is built in Rust, so we use rust-csv for parsing. Here are a few other libraries we recommend:

papaparse. papaparse is a popular JavaScript library that runs in the browser and in Node.js. It is actively maintained by its creator and the open source community.
csv-parser. csv-parser is built for performance and is a great choice if you're expecting to work with large files.
fast-csv. Despite its name, fast-csv is one of the slower libraries. However, it has a few advanced features like callbacks to modify headers and transform rows.
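
Whichever library you choose, the edge-case layer on top tends to look similar. As a hedged illustration, the sketch below uses Python's standard csv module (not one of the libraries listed above) to handle a byte-order mark, guess the delimiter, and skip blank rows. The parse_csv_bytes name and the set of candidate delimiters are assumptions.

```python
import csv
import io
from typing import List


def parse_csv_bytes(data: bytes) -> List[List[str]]:
    """Decode and parse an uploaded file into a list of rows."""
    # "utf-8-sig" strips the byte-order mark that some spreadsheet tools prepend.
    text = data.decode("utf-8-sig")
    try:
        # Guess the delimiter from a sample; candidates are an assumption.
        dialect = csv.Sniffer().sniff(text[:4096], delimiters=",;\t|")
    except csv.Error:
        dialect = csv.excel  # fall back to plain comma-separated
    reader = csv.reader(io.StringIO(text, newline=""), dialect)
    # Drop rows that are entirely empty (e.g. trailing blank lines).
    return [row for row in reader if any(cell.strip() for cell in row)]
```

Delimiter sniffing is a heuristic and can guess wrong on unusual files, so it is worth letting the user override the detected delimiter.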

4. Build data validation logic

Typically, the most time-consuming aspect of building and maintaining a CSV importer is the data validation logic.

There are typically 3 types of validations you may need in your CSV importer:

Data type validations. For each column, you’ll likely want to validate that the data conforms to your data type specification. For example, you may want to validate that all numbers are formatted with two digits past the decimal point and are written without commas or that all URLs end with a valid TLD. Data type validations are often achieved via regular expressions.
Multi-column validations. You may need to implement more complex business logic that involves data from multiple columns in a record. Common validations in this category include referential integrity and mathematical validations.
Database or 3P validations. You may want to validate that uploaded data is valid given information in an existing database or from a 3P data source. Common use cases for database validations include de-duplication and referential integrity.
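
The three validation types above can be sketched in a single row-level check. This Python example is illustrative only: the column names (amount, start_date, end_date, email) are hypothetical, and the database lookup is simulated with an in-memory set rather than a real query.

```python
import re
from typing import Dict, List, Set

# Two digits past the decimal point, no commas, e.g. "1200.50".
AMOUNT_PATTERN = re.compile(r"\d+\.\d{2}")


def validate_row(row: Dict[str, str], existing_emails: Set[str]) -> List[str]:
    """Return a list of error messages for one record (empty if valid)."""
    errors: List[str] = []

    # 1. Data type validation on a single column, via a regular expression.
    if not AMOUNT_PATTERN.fullmatch(row["amount"]):
        errors.append("amount: expected a number like 1200.50")

    # 2. Multi-column validation. Dates are assumed to already be in
    #    YYYY-MM-DD form, which compares correctly as plain strings.
    if row["end_date"] < row["start_date"]:
        errors.append("end_date: must not be before start_date")

    # 3. Database validation (de-duplication), simulated with a set.
    if row["email"].lower() in existing_emails:
        errors.append("email: already exists")

    return errors
```

Returning all errors for a row, rather than stopping at the first, lets the data review step show the customer everything that needs fixing in one pass.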

OneSchema supports an out-of-the-box library of validations for CSV files that can be used with or without OneSchema’s CSV import UI.

5. Launch & Iterate

Once you’ve defined and implemented your user experience, you’re ready to launch your new CSV upload feature!

Home-rolled CSV uploaders often take several months to roll out fully because many edge cases are difficult to debug without live user traffic. For example, it is difficult to harden your CSV parsing layer or catch edge cases in your data validation logic without seeing some unusual files. Data mapping is another common step that requires extensive testing, especially if you implement logic to attempt to automatically map data for your customer.

We recommend rolling out your CSV importer to a subset of beta users and doing a few user research sessions to make sure customers understand how to use the feature. We also frequently see developers implement a rollback feature for bad imports as they can be quite painful to reverse manually.
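
One common way to make rollback feasible is to tag every imported row with a batch identifier. The following Python sketch uses the standard sqlite3 module; the contacts table, its columns, and the function names are hypothetical, and it only covers newly inserted rows, not updates to existing records.

```python
import sqlite3
import uuid
from typing import Iterable, Tuple


def import_contacts(conn: sqlite3.Connection, rows: Iterable[Tuple[str, str]]) -> str:
    """Insert rows tagged with a fresh batch id and return that id."""
    batch_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO contacts (name, email, import_batch_id) VALUES (?, ?, ?)",
            [(name, email, batch_id) for name, email in rows],
        )
    return batch_id


def rollback_import(conn: sqlite3.Connection, batch_id: str) -> int:
    """Delete every row created by one import; returns the number removed."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM contacts WHERE import_batch_id = ?", (batch_id,)
        )
    return cursor.rowcount


# In-memory database used only to demonstrate the idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT, import_batch_id TEXT)")
```

If an import can also update existing records, rollback needs a record of the previous values as well, which this sketch does not attempt.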

Most teams also choose to publish a support page on how to use their CSV import tool. Unless it has been impeccably designed, CSV import is a common source of support requests, and an FAQ covering common points of confusion can save your support team a lot of time.

Conclusion

CSV uploaders can vary drastically in complexity depending on the specifics of your data model and use case. They can take anywhere from 2 weeks to 6 months (or more!) to launch and typically involve engineering, product, and design.

It can also be helpful to review the CSV import experience in other SaaS solutions, as you may notice design patterns that would work well for your customers. Your competitors' data upload experiences will be especially helpful, as they are likelier to have similar data and customers. However, there is no substitute for user testing, as there are often nuances to your data model that you will need to clarify with your users.

Especially if your data upload experience is on a critical flow (such as customer onboarding), we highly recommend leaving ample time for testing. If you have any questions about the design of your experience, we are more than happy to share our expertise! Feel free to reach out to us at support@oneschema.co
