SRX Services Gateway

SRX 380 stuck in coredump reboot loop!

2 weeks ago

I rebooted my SRX 380 running Junos 19.3R2-S3.4, and it seems to be stuck in the following reboot loop because of an in-progress coredump. How can I get out of this and force the box to start up? Also, how can I prevent this from happening again?

root@srx380> request system reboot
Reboot the system ? [yes,no] (no) yes

Shutdown NOW!
[pid 5800]


*** FINAL System shutdown message from root@poc-srx380-4 ***

System going down IMMEDIATELY


root@poc-srx380-4> Jul 27 15:54:07 init: dialer-services (PID 2062) exited with status=0 Normal Exit
Jul 27 15:54:07 init: neighbor-liveness (PID 2057) exited with status=0 Normal Exit
Jul 27 15:54:07 init: internal-routing-service (PID 2056) exited with status=0 Normal Exit
Jul 27 15:54:07 init: firewall (PID 2055) exited with status=0 Normal Exit
Jul 27 15:54:07 init: periodic-packet-servicCoredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress

 

This goes on and on, and the box never actually reboots.
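
For anyone who captures the console session to a file, a rough way to confirm the loop is to count how often the message repeats. This is only a minimal Python sketch: the function names and the threshold of 3 are assumptions made for illustration, not anything Junos provides.

```python
# Marker text copied from the console output above.
LOOP_MARKER = "Coredump of a process in progress"


def count_coredump_messages(console_text: str) -> int:
    """Count console lines that contain the coredump-in-progress marker."""
    return sum(1 for line in console_text.splitlines() if LOOP_MARKER in line)


def looks_stuck(console_text: str, threshold: int = 3) -> bool:
    """Heuristic: treat the box as stuck once the marker repeats
    at least `threshold` times (the threshold is an arbitrary assumption)."""
    return count_coredump_messages(console_text) >= threshold


# Sample taken from the console capture in this post.
sample = """\
Jul 27 15:54:07 init: periodic-packet-servicCoredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress
Ignoring watchdog timeout during boot/reboot
Coredump of a process in progress
"""

print(count_coredump_messages(sample))
print(looks_stuck(sample))
```

The substring match is deliberate, because the first occurrence is interleaved with an init message on the same line.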

Thanks for any help!

4 REPLIES

Re: SRX 380 stuck in coredump reboot loop!

2 weeks ago

Hi Zippie,

 

I would recommend doing a format install, or following the process in KB https://kb.juniper.net/InfoCenter/index?page=content&id=kb10386, since the CLI is no longer accessible.


Re: SRX 380 stuck in coredump reboot loop!

2 weeks ago

Hey Zippie,

 

 

Before doing a format install, I would recommend trying the following, in case you have another SRX available:

 

  • Plug a USB drive into a known-healthy SRX of the same model and configuration as the faulty device, then run the following command:

 

 

>request system snapshot media usb 

 

  • When the snapshot is complete, power down the faulty SRX. Remove the USB from the healthy SRX and plug it into the faulty SRX.


  • Power on the faulty device and see if it will boot from the USB drive.


  • If it does, then snapshot the image from the USB drive back onto the compact flash (CF) with the following command:

 

>request system snapshot media internal 

 

This will load the image from the USB onto the SRX's internal compact flash.

 

 

  • Then reboot the device from the compact flash. Use the following command:

 

>request system reboot media internal


Note: Taking a snapshot of the healthy device will also take a snapshot of the configuration, and will load the same configuration on the faulty device.

 

Reference: https://kb.juniper.net/InfoCenter/index?page=content&id=KB29811

 

If this does not work, you can consider doing a format install:

 

https://kb.juniper.net/InfoCenter/index?page=content&id=KB33005&cat=SRX_SERIES&actp=LIST&showDraft=f...

source: https://takab-cng.ir/?exampass=t5/SRX-Services-Gateway/Recover-SRX-from-snapshot-on-USB/td-p/461377

 

 


If this solves your problem, please mark this post as "Accepted Solution" so we can help others too \:)/

 

Regards,

 

Lil Dexx
JNCIE-ENT#863, 3X JNCIP-[SP-ENT-DC], 4X JNCIA [cloud-DevOps-Junos-Design], Champions Ingenius, SSYB

 


Re: SRX 380 stuck in coredump reboot loop!

2 weeks ago

Hello Zippie,

 

If the device reports "Coredump of a process in progress" but cannot actually write the core dump, that can point to a critical problem with the hardware itself. In that case it will usually throw an error, e.g. ** Dump failed (rc = 5) **, and an RMA will most probably be required.

However, it would be better to try the recovery steps suggested by others first.

 

  1. Since the device is stuck in a boot loop, you need to interrupt it at the loader prompt. Once you have done that, perform the USB re-imaging described in the KB article - https://kb.juniper.net/InfoCenter/index?page=content&id=KB33005&cat=SRX_SERIES&actp=LIST&showDraft=f...
  2. If the above step doesn't work, there is one more recovery method to try. For this one, you need a working SRX380 available. Take a snapshot of the working SRX onto a USB drive, insert the USB drive into the non-working SRX, and try to boot it from USB - https://kb.juniper.net/InfoCenter/index?page=content&id=KB29811
  3. If that doesn't work either and the device is stuck at the db> prompt, enter the following commands to recover it: db> cont, and if that doesn't work, db> reset.
  4. If none of the above works, then RMA is the only option.

 

Note: While performing step 1, it would be a good idea to download and install the recommended Junos version, Junos 20.1R1-S2, if the device is deployed standalone, because I believe the device shipped with Junos 20.1R1 by default when it was launched.



Thanks,
π00bm@$t€®.
Please, Mark My Solution Accepted if it Helped, Kudos are Appreciated too!!!

Re: SRX 380 stuck in coredump reboot loop!

2 weeks ago

Hi,

 

Please be aware that the SRX380 requires Junos 20.1R1 or newer to operate. Having it boot on 19.3R2-S3 actually impresses me a bit :)

 

I expect your coredump loop will be solved by upgrading to a 20.1- or 20.2-based release. You may have to do a USB-based install as previously mentioned, as I would not trust an upgrade from 19.3R2 to work on a platform that release does not support.
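
To make the version comparison concrete, here is a minimal Python sketch that checks a release string against the 20.1R1 minimum. The function names are hypothetical, and the parser assumes only the plain "R" release format seen in this thread (e.g. 19.3R2-S3.4); other Junos release types are not handled.

```python
import re

# Matches strings like "20.1R1", "20.1R1-S2" or "19.3R2-S3.4".
# Assumption: only standard "R" releases, as quoted in this thread.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)(?:-S(\d+)(?:\.(\d+))?)?$")


def parse_junos_version(version: str) -> tuple:
    """Return (major, minor, release, service, spin) as integers.
    Missing service-release parts are treated as 0."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized Junos version format: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(running: str, minimum: str = "20.1R1") -> bool:
    """True if `running` is at least `minimum` (tuple comparison, field by field)."""
    return parse_junos_version(running) >= parse_junos_version(minimum)


print(meets_minimum("19.3R2-S3.4"))  # False: 19.3 is older than 20.1
print(meets_minimum("20.1R1-S2"))    # True
```

The 20.1R1 default comes from the minimum stated above; pass a different `minimum` if you need to check against another baseline.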


--
Best regards,

Jonas Hauge Klingenberg
Juniper Ambassador & Technology Architect, SEC DATACOM A/S (Denmark)