In-depth Analysis of the Hardware Watchdog Function in Serial Device Server: Self-Recovery within 10 Seconds after Program Crash
With the rapid development of the Industrial Internet of Things (IIoT), the serial device server has become a core hub connecting traditional equipment to digital networks, and its stability directly affects production-line continuity and the reliability of data transmission. However, electromagnetic interference, software defects, or hardware aging can cause its program to get stuck in an infinite loop or stop responding, taking the device offline and leading to data loss or even production accidents. The hardware watchdog timer (WDT), the "safety guardian" of embedded systems, addresses this by automatically detecting program anomalies and triggering a system reset. This article analyzes the working principle and implementation of the hardware watchdog and, using the USR-N520 serial device server as a practical case, explores how to achieve self-recovery within 10 seconds after a program crash.
The hardware watchdog is an independent timer circuit whose core function is to monitor the operating status of the main program. While the system is running normally, the program must periodically "feed" the watchdog (for example, by writing a specific value to a watchdog register or toggling a dedicated feed pin) to reload the timer. If the program fails to feed the watchdog in time because of an infinite loop, a runaway program, or resource exhaustion, the timer expires and the watchdog asserts a reset signal that restarts the system. This mechanism replaces "manual intervention after a fault occurs" with "automatic recovery after a fault occurs," significantly improving system availability.
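The feed-or-reset behavior described above can be illustrated with a small software model. This is only a sketch for reasoning about the mechanism, not a driver for any real watchdog peripheral; the names `wdt_model_t`, `wdt_start`, `wdt_feed`, `wdt_tick`, and `wdt_simulate` are hypothetical and chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watchdog down-counter (not a real driver). */
typedef struct {
    uint32_t reload;   /* ticks granted by each feed */
    uint32_t counter;  /* ticks left before the reset is asserted */
    bool     reset;    /* latched once the counter reaches zero */
} wdt_model_t;

void wdt_start(wdt_model_t *w, uint32_t reload_ticks)
{
    assert(reload_ticks > 0u);
    w->reload  = reload_ticks;
    w->counter = reload_ticks;
    w->reset   = false;
}

/* Called by a healthy main program: reloads the counter. */
void wdt_feed(wdt_model_t *w)
{
    if (!w->reset) {
        w->counter = w->reload;
    }
}

/* Called once per timer tick, independently of the main program. */
void wdt_tick(wdt_model_t *w)
{
    if (w->reset) {
        return;
    }
    if (w->counter > 0u) {
        w->counter--;
    }
    if (w->counter == 0u) {
        w->reset = true;  /* timeout: the reset line is asserted */
    }
}

/* Simulate total_ticks ticks. The main program feeds on every tick before
 * hang_at and then stops feeding, as if stuck in an infinite loop.
 * Returns the tick at which the reset is asserted, or -1 if it never is. */
int wdt_simulate(uint32_t reload_ticks, int total_ticks, int hang_at)
{
    wdt_model_t w;
    wdt_start(&w, reload_ticks);
    for (int t = 0; t < total_ticks; t++) {
        if (t < hang_at) {
            wdt_feed(&w);
        }
        wdt_tick(&w);
        if (w.reset) {
            return t;
        }
    }
    return -1;
}
```

In this model, `wdt_simulate(10, 100, 100)` returns -1 because the program never stops feeding, while `wdt_simulate(10, 100, 20)` returns 28: the counter is last reloaded to 10 at tick 19 and reaches zero at tick 28.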
Typical Case: A serial device server deployed by an automobile manufacturing enterprise experienced a program crash due to electromagnetic interference. Without a watchdog configured, the device remained offline for up to 2 hours, causing a production line halt. After introducing the hardware watchdog, the fault recovery time was shortened to within 10 seconds, and annual downtime losses were reduced by 90%.
| Characteristic | Hardware Watchdog | Software Watchdog |
| --- | --- | --- |
| Independence | Independent hardware circuit, unaffected by the main program | Depends on the main program running and may be turned off by mistake |
| Response Speed | Reset is asserted by hardware as soon as the timeout expires, without relying on program scheduling | May be delayed because it depends on main program scheduling |
| Anti-interference Ability | Strong resistance to electromagnetic interference and power supply fluctuations | Susceptible to program errors or resource contention |
| Typical Application Scenarios | Industrial control, vehicle-mounted equipment, medical electronics | Consumer electronics, scenarios with low reliability requirements |
Tout = Tfeed × (N + 1)
where Tfeed is the watchdog feed interval (usually set to 1 second), and N is the number of allowed watchdog feed failures (for example, when N = 9, Tout = 10 seconds).
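The timeout formula can be checked with a few lines of C. This is a minimal sketch; the function names `wdt_timeout_ms` and `wdt_allowed_misses` are hypothetical helpers introduced here, and all times are assumed to be in milliseconds.

```c
#include <stdint.h>

/* Tout = Tfeed x (N + 1), with all times in milliseconds. */
uint32_t wdt_timeout_ms(uint32_t feed_interval_ms, uint32_t allowed_misses)
{
    return feed_interval_ms * (allowed_misses + 1u);
}

/* Inverse of the formula: the largest N for which Tfeed x (N + 1) still fits
 * within timeout_ms. Returns -1 if the budget is shorter than one feed
 * interval or the interval is zero. */
int32_t wdt_allowed_misses(uint32_t timeout_ms, uint32_t feed_interval_ms)
{
    if (feed_interval_ms == 0u || timeout_ms < feed_interval_ms) {
        return -1;
    }
    return (int32_t)(timeout_ms / feed_interval_ms) - 1;
}
```

With the values from the text, `wdt_timeout_ms(1000, 9)` gives 10000 ms (10 seconds), and `wdt_allowed_misses(10000, 1000)` gives 9.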
2.2 Watchdog Feed Strategy: Dual-level Verification Mechanism
To avoid mistaken resets, a "hardware + software" dual-level verification mechanism should be adopted:
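```c
#include <stdint.h>
// Also include your MCU's CMSIS device header here; it defines SysTick,
// the SysTick_CTRL_* masks, and NVIC_SystemReset().

#define TICK_COUNTS 10000u  // Assuming a SysTick clock of 1 MHz, 10,000 counts = 10 ms
#define MISS_LIMIT  1000u   // 10 ms × 1000 = 10 seconds

// Heartbeat flag: set by the main program, read and cleared in the interrupt
static volatile uint8_t heartbeat_flag = 0;

// Initialize the software-level heartbeat monitor: start a 10 ms SysTick tick
// (the 10-second timeout itself is enforced in SysTick_Handler)
void WDT_Init(void) {
    SysTick->LOAD = TICK_COUNTS - 1u;  // The counter runs LOAD + 1 counts per period
    SysTick->VAL  = 0;
    SysTick->CTRL = SysTick_CTRL_ENABLE_Msk | SysTick_CTRL_TICKINT_Msk;
}

// Watchdog feed operation (called regularly by the main program)
void Feed_Watchdog(void) {
    heartbeat_flag = 1;  // Update the heartbeat flag bit
}

// Tick interrupt service routine: counts consecutive ticks without a heartbeat
void SysTick_Handler(void) {
    static uint32_t miss_count = 0;

    if (heartbeat_flag == 0) {
        miss_count++;
        if (miss_count >= MISS_LIMIT) {
            NVIC_SystemReset();  // Trigger a system reset
        }
    } else {
        heartbeat_flag = 0;  // Clear the flag bit
        miss_count = 0;
    }
}
```
2.3 Post-reset State Recovery: Seamless Integration of Business Logic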
After the system is reset, it needs to quickly recover to the operating state before the fault to avoid data loss or business interruption:
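One common way to approach this recovery step is to keep a small, integrity-checked snapshot of the working configuration and validate it at boot. The sketch below is an assumption-laden illustration, not the USR-N520's actual mechanism: the `saved_state_t` layout, its fields, and the functions `state_save` and `state_restore` are hypothetical, and the storage slot is assumed to live in memory that survives a reset (for example flash, EEPROM, or battery-backed RAM). The integrity check uses the FNV-1a hash for brevity; a CRC is a common alternative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STATE_MAGIC 0x57445431u  /* arbitrary marker chosen for this example */

/* Hypothetical snapshot of what the device needs after a reboot.
 * Fields are ordered so that the struct contains no padding bytes. */
typedef struct {
    uint32_t magic;
    uint32_t baud_rate;
    uint8_t  remote_ip[4];
    uint16_t remote_port;
    uint16_t reserved;
    uint32_t checksum;  /* covers every field above */
} saved_state_t;

/* 32-bit FNV-1a hash, used here as a simple integrity check. */
static uint32_t fnv1a(const uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

uint32_t state_checksum(const saved_state_t *s)
{
    return fnv1a((const uint8_t *)s, offsetof(saved_state_t, checksum));
}

/* Called whenever the working configuration changes. */
void state_save(saved_state_t *slot, uint32_t baud,
                const uint8_t ip[4], uint16_t port)
{
    memset(slot, 0, sizeof *slot);
    slot->magic       = STATE_MAGIC;
    slot->baud_rate   = baud;
    memcpy(slot->remote_ip, ip, 4);
    slot->remote_port = port;
    slot->checksum    = state_checksum(slot);
}

/* Called early after reset. Returns false if the slot is empty or corrupted,
 * in which case the caller should fall back to default settings. */
bool state_restore(const saved_state_t *slot, saved_state_t *out)
{
    if (slot->magic != STATE_MAGIC) {
        return false;
    }
    if (slot->checksum != state_checksum(slot)) {
        return false;
    }
    *out = *slot;
    return true;
}
```

The magic value distinguishes a never-written slot from a saved one, and the checksum rejects a snapshot that was only partially written when the crash or reset occurred.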
| Indicator | Evaluation Standard |
| --- | --- |
| Watchdog Type | Prioritize hardware watchdogs to avoid the risk of software watchdogs being turned off by the program |
| Reset Time | Require ≤ 10 seconds to ensure rapid fault recovery |
| Operating Temperature Range | Industrial-grade equipment should support -40°C to 85°C to adapt to outdoor or extreme environments |
| EMC Protection Level | At least comply with standards such as IEC 61000-4-2 (ESD) and IEC 61000-4-4 (EFT) |
| Protocol Support | Support protocols such as TCP/UDP/MQTT/HTTP to be compatible with existing systems |
Dual Watchdog: Hardware watchdog + software watchdog provide dual protection.
High-reliability Design: Cortex-M7 core, deeply optimized TCP/IP protocol stack, and dual Socket backup.
Ease of Use: Supports Web configuration, AT commands, and transparent transmission to cloud platforms, lowering the barrier to deployment.
Ecosystem Support: Provides complete development documentation, SDKs, and 24/7 technical support.
In the era of the Industrial Internet of Things, the stability of equipment directly determines the competitiveness of enterprises. The hardware watchdog provides a guarantee of "immortality" for serial device servers through automated and real-time fault recovery mechanisms. With its industrial-grade design, dual watchdog mechanism, and 10-second self-recovery capability, the USR-N520 has become the preferred solution for industries such as power, transportation, and energy.