everydayExcel Newsletter Header

How I manage 150 millions rows of data with Excel

In one of the projects that I worked on, the client showed me an Excel file with about 300,000 rows of data. The data was collected by some gadgets and downloaded into Excel, the most common media to manage data. Since it is collected by gadgets, most of us would have assumed that the data collected is perfect and can be analysed straight away. But that is far from the truth. Gadgets can malfunction for various reasons, lack of power, imperfect installation, breakdown in data collection process, etc. This means that I would have to remove the imperfect data in the data set. Inperfect data consist of errors, zeros, empty readings, incorrect dates. It took me about 15 mins to clean up 300K of rows of data. This is done using functions like auto-filter, find all, goto special and some formulas.

To clean the file completely, I took about 15 mins. That is no big deal at all. But it was a big deal if there were 600 files with similar conditions to clean up. If a file is going to take me 15 mins, then 4 files will take 1 hour, 40 files 10 hours, 400 files 100 hours, 600 files 150 hours. 150 hours was eqivalent to 6 over days, working non stop for 24 hours a day. I wouldn't want to stare at the screen for more than an hour without rest. And I couldn't guanrantee that I would do a perfect clean up after staring at the computer screen for more than one hour. I can distribute the job to more people. To complete 600 files in one day, I will need about 20 people working the full 8 hours day without breaks. This sounds viable if I and only if I can get 20 people to work on the files instantly and they know what you are trying to achieve. Instead of trying to find 20 people to work on this, I chose the the option to create a macro to process the files. It took me about 1 hour to create the simple macro, using macro recording and then modify the recording to save time. Then I run it on the files.

The macro is able to automatically process the files in the folder one by one, open, clean up the data, save and close the file. All it takes to process the file is about 1 min, much faster than any human being and highly accurate as well. So for 600 files, it will take about 600 minutes or 10 hours to complete. Of course, it takes time to build the macros and how fast you can build the macros depends on how experience you are. Learning it and put it into practice will make the skill come in handy at the critical moment. But it was a worthy investment to pick up macro. I manage to shorten a 150 hours job to 10 hours, achieving 90% savings on this excel project. Not only that, I do not have to employ 20 people to work on the file. That is another savings of $1,600 (20 x $80). So if you are looking to shorten your overtime hours or emply less people, learn Macro/Excel VBA. It is worth picking up the skill.

Ask us an Excel question and get your question answered by our team of Excel experts

Click on this link to post your question.

Upcoming Courses

Skills Futures CreditFor Individuals - If you are a Singaporean aged 25 and above, you are given $500 in your newly created Skills Future Credit Account. The credit in the account is meant for you to pay for WDA approved courses for the purpose of upgrading yourself without company sponsors. You can pay for our courses using your Skills Future Credit.

Hidden Secrets in Data Analysis with Excel  - 7 & 8 May 2018 (Limited Seats)

Business Analytics Reporting with PowerPoint 2010 - 4 & 5 Jun 2018 (confirmed)

The Hidden Key to Excel Reporting and Charting - 14 & 15 May 2018 (confirmed)

Unleash the Power of Excel PowerPivot for Data Insights - 24 & 25 May 2018 (confirmed)

Learning the Magic of Macros 2007 (Excel VBA) - 11 & 12 Jun 2018 (confirmed)

Breakthrough Performance for HR in Excel 2007/2010 - 9 & 10 Jul 2018 (confirmed)

For the details of the courses, please refer to our Excel courses page.

Subscribe to our newsletter

If you received this newsletter from a friend and would like to be included in our mailing list, please go to our Excel Today's page.

Follow us at Facebook, LinkedIn

If you have a facebook account, you can like us at facebook or linkedln. From facebook or LinkedIn, you will be notified on mini cases and ideas how we use Excel on a day to day basis. These tips may not be published in our newsletter.

Yours Sincerely,

Jason Khoo
Chief Trainer
everydayExcel Business Lab Pte Ltd
Author of "Hidden Secrets of Data Analysis with Excel" and 
"Excel Secrets for Highly Effective Marketers"


Received this newsletter from a friend? You can request to be included in our mailing list by signing up at our Excel Today's page. If you have friends whom you think might be interested in this newsletter, feel free to send it to them.