How to not drift away when reading academic papers

Whether you like it or not, reading academic papers is a big part of any research project. The most general and agreed-upon theories taught during undergraduate studies are published in books, which are usually easier to read. However, more specialized and cutting-edge knowledge is only available in papers. Don’t get me wrong, I don’t hate reading papers, as they are mostly on topics I am interested in. However, if I have to read many papers at once, or if a paper is poorly written, it can become a bit dreary. Sometimes, I read the same paragraph multiple times without actually taking in the text at all. It’s like part of my brain is just turned off, and I get sleepy because of it. If this sounds familiar to you, then I might have a solution for you.

The current article in paper layout

In my experience, I manage to maintain concentration when reading a paper if it is read aloud to me at the same time. Since the same text reaches me both visually and as audio, multiple processing lanes in my brain are activated to interpret the text. While this isn’t exactly scientifically proven, it works for me (but it might not for you). Now, unless you’re young or rich enough to have someone read it to you, we are going to use a text-to-speech program as our reader.

Many text-to-speech engines are available for free. Most people will probably think of the read-aloud function in something like Google Translate. The major problem is that academic papers almost exclusively come in the form of PDFs. If you have ever tried copying text from a PDF, you’ll know that it can be a little frustrating: a line break is often added at the end of every line of text, and header and footer text is often inserted in the middle of continuous text that is divided over two text boxes. We can remove these line breaks and other anomalies manually or with some program, of course. But the extra effort required for these operations just isn’t worth it.

Copying text from a PDF often results in non-continuous paragraphs
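If you are curious what the "some program" route could look like, here is a minimal Python sketch. It is only an illustration, not a tool I actually use: the function name `clean_pdf_text` is made up for this example, and it assumes that paragraphs in the copied text are separated by blank lines and that a line ending in a hyphen followed by a lowercase letter is a word split across two lines.

```python
import re


def clean_pdf_text(raw: str) -> str:
    """Join hard-wrapped lines of copied PDF text into continuous paragraphs.

    Assumptions (may not hold for every PDF):
    - blank lines mark paragraph boundaries
    - a trailing hyphen followed by a lowercase letter is a split word
    """
    # Normalize Windows and old Mac line endings to "\n"
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    cleaned = []
    # Split on one or more blank lines to get the paragraphs
    for paragraph in re.split(r"\n\s*\n", text):
        lines = [line.strip() for line in paragraph.split("\n") if line.strip()]
        if not lines:
            continue
        joined = lines[0]
        for line in lines[1:]:
            if joined.endswith("-") and line[:1].islower():
                # Word hyphenated across a line break: drop the hyphen
                joined = joined[:-1] + line
            else:
                # Ordinary wrapped line: join with a single space
                joined += " " + line
        cleaned.append(joined)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    copied = "Reading academic pa-\npers is a big part\nof research.\n\nSecond para-\ngraph here."
    print(clean_pdf_text(copied))
```

Note that this sketch does nothing about header and footer text that ends up in the middle of a paragraph, and it will wrongly glue together genuine compounds such as "text-to-" followed by "speech" on the next line. That is part of why I don't consider this route worth the effort.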

I’ve found an easy-to-use solution in Naturalreaders.com. The Online Reader, which you can install on your computer as a PWA, provides a field where you can paste any text or even drag and drop a PDF file. The reader automatically detects continuous blocks of text, so paragraphs are read without tiny unnatural interruptions. Additionally, you can click on any sentence to start reading from that point, and the reading speed can be adjusted to match your own. The only issue is that the free voices available sound very robotic and do not provide a comfortable listening experience. The next step is to find natural-sounding text-to-speech voices.

User interface of NaturalReader with a PDF file

If you’re already using the new Microsoft Edge browser, you may have noticed that it offers some additional voices that sound human-like. It seems that one of the many extra features Microsoft has brought to its Chromium browser is a set of natural-sounding neural text-to-speech voices powered by Azure. You can test out how these voices sound right here. If you’re on Windows and using Google Chrome as your browser, I would recommend switching over to Microsoft Edge. It is the same browser under the hood since it’s also based on Chromium. This means that it has the same extension and casting support, but with many useful extra features, such as extensive privacy settings like those in Firefox and full-page screenshots. Furthermore, it performs better than Chrome, consuming less CPU power and RAM. I know that this article has almost turned into an all-out ad for the Edge browser. But just give it a shot. Microsoft Edge is also available on Linux and macOS, but the argument is weaker there since I have little to no experience with those systems, so your mileage may vary.

Comparison of the available TTS voices within the Google Chrome and Microsoft Edge browsers

Alternatively, if you really don’t want to convert to enlightenment and would rather stay with your current browser, there is another option if you’re on Windows. Before I improved my life by switching to the one and only Microsoft Edge, I had found another neural text-to-speech voice that is already installed on your Windows PC. Cortana is a virtual assistant similar to Siri or Google Assistant, but made by Microsoft. Its features were limited and have recently been reduced further by Microsoft due to a lack of users, but its voice is still installed. I’ve made a guide to enable the Cortana voice for general text-to-speech purposes here. However, I cannot promise that it will work for you.