Testing Hardware
Hardware components can start malfunctioning at any time, but a malfunction is not always easy to notice, since it doesn’t always result in a BSOD or some other obvious error. Sometimes significant damage has to occur before you notice the failure during regular use, and if this happens when you’re not prepared, it can be very unpleasant.
For this reason, it’s recommended that you test the hardware components every once in a while, to expose any hidden defects. You should first of all test them when they’re brand new, since hardware often leaves the factory with a small defect. It goes without saying that you should test them if you’re overclocking or undervolting. Running these tests too often is not recommended, though, so you should get an idea of which components are more likely to develop errors over time. For example, with normal use, storage (HDD, SSD) and RAM are more likely to start malfunctioning than the CPU.
And obviously testing hardware is recommended any time you notice operating system instabilities, since those are very often caused by faulty hardware.
General guidelines
Testing specific components requires specific tools, which are listed below. At this point, let me warn you that stress testing the CPU, GPU and RAM can produce high temperatures, so make sure you have adequate cooling. You should also keep an eye on the temperatures with monitoring software, such as HWiNFO or HWMonitor.
Stress testing can also cause system instability. If this happens, and it isn’t caused by overheating (which is simply a result of poor cooling), it usually means there’s a hardware defect. So don’t stress test while you’re running important tasks (unsaved documents open, etc.), and ideally don’t run any other software at all.
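If your monitoring software can log its sensor readings to a CSV file (HWiNFO can), you can check afterwards how hot things got during a test run. Here is a minimal Python sketch of that check. The column name "CPU Temp", the sample readings and the 85 °C limit are all made up for illustration; use the column names from your own log and the limit that applies to your hardware.

```python
import csv
import io


def peak_temperature(csv_text, column):
    """Return the highest numeric value in `column`, or None if nothing parses."""
    reader = csv.DictReader(io.StringIO(csv_text))
    peak = None
    for row in reader:
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip blank cells and any non-numeric rows
        if peak is None or value > peak:
            peak = value
    return peak


def ran_too_hot(csv_text, column, limit_c):
    """True if any reading in `column` went above `limit_c`."""
    peak = peak_temperature(csv_text, column)
    return peak is not None and peak > limit_c


# Hypothetical log excerpt; a real log has many more columns and rows.
sample_log = "Time,CPU Temp\n12:00:00,61\n12:00:02,88\n12:00:04,79\n"

print(peak_temperature(sample_log, "CPU Temp"))
print(ran_too_hot(sample_log, "CPU Temp", limit_c=85))
```

This only reviews a log after the fact. During the test itself you should still watch the live readings and stop if they climb too high.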
Testing CPU
- LinX – best for testing CPU core stability. Set it to use a large amount of RAM and run it for a couple of hours, or at least 20 passes.
- Prime95 – best for testing the integrated memory controller. Run the "In-place large FFTs" test for a few hours or ideally overnight.
Testing RAM
- Memtest86+ – best for testing all of the system's RAM. Boot it from a CD or a USB drive and run it for several passes, ideally overnight.
- HCI Memtest – will spot some errors that Memtest86+ might miss, for example when the RAM voltage is slightly too low. The free version only runs from within Windows, so you won’t be able to test all of the RAM, but it’s still very useful. Run as many instances as you have CPU cores, close all other programs and split as much of the free RAM as you can between the instances. It might also be a good idea to disable the pagefile during testing, so that it doesn't slow down the process. For stability, I’d recommend at least 300% coverage, ideally 1000%.
- Windows Memory Diagnostic Tool – included in Windows, tests the RAM outside of Windows after a reboot. You can find it in Administrative Tools. It launches after the reboot, and you can then select different tests by pressing F1. If no errors are found, you'll get a popup notification after logging into Windows. You can also check the results in the Event Viewer (under Windows Logs → System).
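For HCI Memtest, the amount of RAM to enter in each instance is simple arithmetic: take the free RAM, leave a little for Windows, and divide the rest by the number of instances. A minimal Python sketch follows. The 512 MB reserve is my own assumption rather than an official figure, and the function name is made up for illustration.

```python
import os


def memtest_plan(free_mb, instances=None, reserve_mb=512):
    """Return (instance count, MB of RAM to assign to each instance)."""
    if instances is None:
        # One instance per logical CPU reported by the OS.
        instances = os.cpu_count() or 1
    # Leave a small reserve so Windows itself does not start paging.
    usable_mb = max(free_mb - reserve_mb, 0)
    return instances, usable_mb // instances


# Example: 14000 MB shown as free in Task Manager, 8 instances.
print(memtest_plan(free_mb=14000, instances=8))  # (8, 1686)
```

Read the free RAM figure only after closing all other programs, otherwise the instances will compete with them for memory.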
Testing graphics
- OCCT – can spot some very small errors. Test the GPU for at least 20 minutes, ideally for an hour. Sometimes the errors it shows can be fixed with driver updates, so if you see any, update the drivers first and test again.
Testing drives
- The manufacturer’s own tool – check the manufacturer’s website to see if they offer one. Often it’s something that you have to boot from a CD or a USB drive.
- HD Tune – check the Health status, which will give you a S.M.A.R.T. report, and then test with "Error Scan". "Quick scan" is not thorough enough; the slow scan will give you more reliable information.
- HDDScan – click on the New Task icon, select Surface Tests, then Add Test (Read or Verify). Double-clicking on a test in the Test Manager will show the graphical progress.