
RTX 2080 BSOD in Ark


turbodonkey


7 hours ago, Erethor said:

So, at least for those with the RTX series, try turning shadow quality as low as it goes. That suggestion was made earlier in this thread, but I'm reiterating it because 18 pages is a bit long. It does not get rid of the problem, but for some RTX users it reduces the BSOD frequency to once every few hours rather than once every 5 minutes.

That doesn't even work on GTX 1080 Ti cards; it only reduces the frequency.

Link to comment
15 hours ago, CyberAngel67 said:

Hey, I have seen this error 3 times now, and I know it is Ark related, as it only happens in Ark. But the BSOD indicates it is not Ark.

 

BSOD DPC_WATCHDOG_VIOLATION

The computer has rebooted from a bugcheck.  The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0xfffff802dbe5d378, 0x0000000000000000). 

That is a Windows 10 issue. Not related to Ark.

From Microsoft:

The DPC_WATCHDOG_VIOLATION bug check has a value of 0x00000133. This bug check indicates that the DPC watchdog executed, either because it detected a single long-running deferred procedure call (DPC), or because the system spent a prolonged time at an interrupt request level (IRQL) of DISPATCH_LEVEL or above. The value of Parameter 1 indicates whether a single DPC exceeded a timeout, or whether the system cumulatively spent an extended period of time at IRQL DISPATCH_LEVEL or above. DPCs should not run longer than 100 microseconds and ISRs should not run longer than 25 microseconds, however the actual timeout values on the system are set much higher.
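
For anyone who wants to read a bugcheck line like the one quoted above, here is a minimal Python sketch that pulls the code and its four parameters out of the event-log text and interprets Parameter 1 for 0x133. The function names are made up for illustration. The meaning of Parameter 1 (0 = a single DPC or ISR ran too long, 1 = too much cumulative time at DISPATCH_LEVEL or above) is taken from Microsoft's bug check documentation as I understand it, so treat it as a reading aid rather than a diagnosis.

```python
import re

# Matches "0x00000133 (0x..., 0x..., 0x..., 0x...)" as it appears in the event log.
BUGCHECK_RE = re.compile(
    r"(0x[0-9a-fA-F]+)\s*\(\s*(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+),"
    r"\s*(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\s*\)"
)


def parse_bugcheck(text):
    """Return (code, [p1, p2, p3, p4]) as integers, or None if nothing matches."""
    match = BUGCHECK_RE.search(text)
    if match is None:
        return None
    values = [int(group, 16) for group in match.groups()]
    return values[0], values[1:]


def describe_dpc_watchdog(params):
    """Explain Parameter 1 of a 0x133 (DPC_WATCHDOG_VIOLATION) bugcheck."""
    if params[0] == 0:
        return "a single DPC or ISR exceeded its time allotment"
    if params[0] == 1:
        return "cumulative time at IRQL DISPATCH_LEVEL or above exceeded the limit"
    return "unrecognised Parameter 1 value"


if __name__ == "__main__":
    line = ("The bugcheck was: 0x00000133 (0x0000000000000001, "
            "0x0000000000001e00, 0xfffff802dbe5d378, 0x0000000000000000).")
    code, params = parse_bugcheck(line)
    print(hex(code), describe_dpc_watchdog(params))
```

For the crash quoted above, Parameter 1 is 1, which is the cumulative case. That on its own does not say which driver was responsible; the crash dump is still needed for that.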

Link to comment

10 hours ago, Erethor said:

So, at least for those with the RTX series, try turning shadow quality as low as it goes. That suggestion was made earlier in this thread, but I'm reiterating it because 18 pages is a bit long. It does not get rid of the problem, but for some RTX users it reduces the BSOD frequency to once every few hours rather than once every 5 minutes.

So far it looks like the way DX11 handles variable shadow lighting is not being handled correctly in the driver with the current firmware. Lowering or disabling shadows will help, but it is a band-aid, not a fix.

Link to comment

6 hours ago, TruWrecks said:

That is a Windows 10 issue. Not related to Ark.

From Microsoft:

The DPC_WATCHDOG_VIOLATION bug check has a value of 0x00000133. This bug check indicates that the DPC watchdog executed, either because it detected a single long-running deferred procedure call (DPC), or because the system spent a prolonged time at an interrupt request level (IRQL) of DISPATCH_LEVEL or above. The value of Parameter 1 indicates whether a single DPC exceeded a timeout, or whether the system cumulatively spent an extended period of time at IRQL DISPATCH_LEVEL or above. DPCs should not run longer than 100 microseconds and ISRs should not run longer than 25 microseconds, however the actual timeout values on the system are set much higher.

Yes, but I only get it with Ark. I know what that says and the crash dump leads to a 3rd party driver.

And the way it happens is identical to what is described in this BSOD thread; after applying all the changes and recommendations, I am getting this BSOD instead.

And all the research I have done on this particular BSOD shows that Nvidia has been the culprit many times over the years.

Link to comment

From what I've researched, this is the problem as it currently stands:

 

The BAD_POOL_CALLER BSOD issue is caused by a problem with Nvidia's RTX-enabled drivers. If you have a GTX card (10xx series or earlier), you can solve the issue by reverting to driver version 399.xx or earlier. If you have an RTX card, you can reduce the frequency of the problem by lowering shadows as much as possible, or eliminate it by launching with DX10 graphics, which are significantly less pretty but stable.

This problem is caused by a bad interaction between the engine and the RTX/400+ drivers. It's a low-level problem that would likely take an infeasible amount of effort for Wildcard to fix on their end, like rebuilding the game on a different engine. So all we can really do is what I said above: wait for Nvidia to sort the issue out and keep watch for new driver releases.

Link to comment

9 minutes ago, TheQuasars said:

From what I've researched, this is the problem as it currently stands:

 

The BAD_POOL_CALLER BSOD issue is caused by a problem with Nvidia's RTX-enabled drivers. If you have a GTX card (10xx series or earlier), you can solve the issue by reverting to driver version 399.xx or earlier. If you have an RTX card, you can reduce the frequency of the problem by lowering shadows as much as possible, or eliminate it by launching with DX10 graphics, which are significantly less pretty but stable.

This problem is caused by a bad interaction between the engine and the RTX/400+ drivers. It's a low-level problem that would likely take an infeasible amount of effort for Wildcard to fix on their end, like rebuilding the game on a different engine. So all we can really do is what I said above: wait for Nvidia to sort the issue out and keep watch for new driver releases.

If people are on a 10-series Nvidia card, reverting to a 399.xx driver will not fix this issue; it still happens. It might not be as frequent, but it certainly does happen.

Link to comment

On 12/16/2018 at 3:33 AM, TheQuasars said:

From what I've researched, this is the problem as it currently stands:

 

The BAD_POOL_CALLER BSOD issue is caused by a problem with Nvidia's RTX-enabled drivers. If you have a GTX card (10xx series or earlier), you can solve the issue by reverting to driver version 399.xx or earlier. If you have an RTX card, you can reduce the frequency of the problem by lowering shadows as much as possible, or eliminate it by launching with DX10 graphics, which are significantly less pretty but stable.

This problem is caused by a bad interaction between the engine and the RTX/400+ drivers. It's a low-level problem that would likely take an infeasible amount of effort for Wildcard to fix on their end, like rebuilding the game on a different engine. So all we can really do is what I said above: wait for Nvidia to sort the issue out and keep watch for new driver releases.

 

No, they don't need to "rebuild the whole game on a new engine". They ONLY need to upgrade Unreal Engine to the latest version and reapply only SOME of their already completely crap and stupid ways of dealing with certain things in the engine (or hire a real professional who can do things without changing the engine for them ... because I fully understand that being great at something art related doesn't mean you have to be good at programming, and this game shows that the team was at least not good at both...)

Just because one doesn't know how to do something doesn't mean the tool they're using is bad (technically speaking, they can argue that the tool was bad, since the problem comes from deep inside the Unreal Engine... a fixed problem nonetheless)

The huge benefit EVERYONE would get if they'd update the game to the latest version is, well ... they'd be using the proper latest code and could update every time there's a change in the Unreal Engine ... because they're using the same engine, with all the crappy fubar changes they made, for ALL of their games ... and if we don't insist that they hire proper professionals (I'm fully aware how much professionals cost, and how hard it is to find them), then we're going to be in serious trouble with all the games they make. You can't expect this team to make another game the same way they made Ark... this piece of garbage code they put on top of Unreal should not have been there in the first place, and the bloody engine should be upgradable at any time. FUTURE PROOF is the 101 of using a f.ing library !!!

If you need changes to how a certain thing works, THINK TWICE about whether you really need that change, and if you still think so ... then hire a professional to think for you. And then, if things still don't look good, ask the people who made the library to try and do that for you... (people at Unreal are really nice people who would do a lot of nice things if you paid them enough). And THEY should be THE ONLY people who make changes to the bloody library! NOT YOU!!! The fact that you know how an "if" or a "for" works doesn't make you programmers, and it shouldn't be Nvidia that needs to make exceptions in their driver because there's a single game in the world that hasn't updated Unreal Engine, or any other library for that matter. The next thing we know, these guys will expect Intel or AMD to change how their CPUs work.

The problem still happens for me with shadows on lowest level. 

Link to comment

On 12/18/2018 at 3:09 AM, Vaiks said:

It's beyond belief that this issue is still an issue, but it tells a lot about what kind of company WC really is.

It's telling about how companies don't give a flying raptor after they've sold a game, and why I prefer going for monthly payment games. This is what used to happen ALL THE TIME for ALL THE GAMES that were ever released ... there's a crapton of bugs? Well, tough luck ... the game doesn't work at all once you did this or that ... well, even more tough luck ...

But when it's an "oops, we're losing 90% of our subscribers because everybody and their dog upgraded their video card", there are two choices: "You never get upgraded graphics because upgrade means risk" or "we care about our customers enough that we keep putting effort into fixing the bugs and keeping things amazingly tidy".

If you want a real-world example, look at Blizzard and why it became popular, and look at how it works now ;)

These are facts, and I know it because I'm a bloody game developer myself. I'm just trying to state that there are some bugs which even the worst companies fix ... the ones that stop the game from working altogether ... and especially so around Xmas ... unless you don't care that Steam purchases can be refunded :D

I was expecting WC to be wiser about this matter ... people get new video cards, and the people who get the amazing video cards are exactly the people who buy your game :) If there's any bug you should cater to with more urgency, it's one that keeps these exact people happy ;) These are the ones who bring in money for your greedy stakeholders ...

There are two categories of people who should be your top priority when fixing bugs: the die-hard fans and the rich idiots who have expensive video cards. Both of these categories of people are the ones who feed your children and put coffee on your table. Everyone else matters too, but if you don't fix poop for these two categories ... well, you might not have a job very soon :D

If you want proof, look at the many companies that died because they forgot this rule ... hell, look at how Activision Blizzard or Bethesda Softworks are on their last breath now ;)

Don't be in the bad press, for your own sake ... you've been there once, Wildcard ...

Link to comment

8 hours ago, TruWrecks said:

THIS IS NOT AN ARK ISSUE!

It is a driver and DirectX 11 issue! Nvidia and Microsoft need to fix it!

 

If you want it fixed please complain to the people that can do something to fix it!

Wrong. This was an Unreal Engine issue. Wildcard, as usual, is being wildly unaware while dealing with these game-breaking bugs. Don't you think it's a little strange that there are tons of other UE titles and they are not giving BSODs like this title?

Link to comment

6 hours ago, thebobmaster said:

Wrong. This was an Unreal Engine issue. Wildcard, as usual, is being wildly unaware while dealing with these game-breaking bugs. Don't you think it's a little strange that there are tons of other UE titles and they are not giving BSODs like this title?

Prove it's an Unreal Engine issue when Nvidia has already told me on several occasions that it was a driver issue.

Link to comment

5 hours ago, thebobmaster said:

I may have missed your source where you are mentioning this. Please upload the redacted reply from Nvidia technical staff admitting that it is their driver's issue. I am playing multiple unreal engine games and none of them is exhibiting BSOD. No BSOD at all.

Unreal Engine 4.5 uses calls to the DirectX 11 API through the video driver for variable shadows. This is why disabling shadows reduces crashes. With a 1080 this is not a major issue. On an RTX card the driver doesn't know where to send the command or how to handle errors in the results, and the driver stops responding. Windows monitors the driver, and when it detects that the driver has stopped, it tries to restart it. This puts the driver into a reset loop, and Windows panics and causes a BSOD.

Setting TdrDelay to a value higher than 10 (the value is in seconds) prevents Windows from restarting the driver and lets the driver recover by itself, preventing a driver reset loop and the Windows BSOD.
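
As a rough illustration of the TdrDelay change described above, here is a small Python sketch that only builds the text of a .reg file; it does not touch the registry itself. The registry value is spelled TdrDelay and lives under the GraphicsDrivers key documented by Microsoft, with a default of 2 seconds as far as I know. The function name and the example value of 12 are my own assumptions, picked only because the post says "higher than 10".

```python
# Registry location of the TDR (Timeout Detection and Recovery) settings.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg_file(seconds):
    """Build the text of a .reg file that sets TdrDelay to `seconds`.

    Hypothetical helper for illustration; it returns a string and changes nothing.
    """
    if not isinstance(seconds, int) or seconds < 1:
        raise ValueError("TdrDelay must be a positive whole number of seconds")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{TDR_KEY}]\n"
        # REG_DWORD values are written as eight hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}\n'
    )


if __name__ == "__main__":
    # 12 is an arbitrary example value above 10.
    print(tdr_delay_reg_file(12))
```

Saving the printed text as a .reg file and merging it needs administrator rights, and the new timeout normally only takes effect after a reboot. Back up the registry first, and remember this only changes how long Windows waits before resetting the driver.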

 

Nvidia either needs to write a driver just for the RTX or update the firmware on every card so it behaves more like a 10 series card to correct this issue.

This is what I have been working with NVidia on since October 9th.

From NVidia on 11/02/2018

Quote
We are working on a fix for that issue now, and we have a driver we have been testing with a small handful of users, if you would like to try it and let me know if it helps please download it here:
 
<link removed>
 
Thanks
 
Josh
NVCC

 

Link to comment

On 12/15/2018 at 5:33 PM, TheQuasars said:

From what I've researched, this is the problem as it currently stands:

 

This problem is caused by a bad interaction between the engine and the RTX/400+ drivers. It's a low-level problem that would likely take an infeasible amount of effort for Wildcard to fix on their end, like rebuilding the game on a different engine.

 

This is absolute BS. Stop repeating it. I understand very well that Ark uses a fairly old version of UE4 that is highly customized. This does not make upgrading a huge ordeal. The Unreal Engine is designed to be easily upgraded to new versions and to allow ALL customization to be ported into the new version. A UE4 dev told me exactly this and said anyone claiming otherwise is lying to you.

Link to comment

16 hours ago, TruWrecks said:

Nvidia either needs to write a driver just for the RTX or update the firmware on every card so it behaves more like a 10 series card to correct this issue.

This is what I have been working with NVidia on since October 9th.

From NVidia on 11/02/2018

 

Except that the 10-series cards have the exact same issue; reducing the shadows doesn't stop them from hitting a BSOD either. Nor does dropping back to the recommended 399.24 driver version (did I get that right?).

So how do you expect them to make it behave more like the 10-series cards, when the 10-series cards have the same issue?

Link to comment

I wish there was something we could do besides going back and forth with conjecture to help speed things along. Ultimately DX10 appears to be mostly solid, in my personal experience. It still has some issues, but in 40 hours of game time I had maybe 2 lockups with DX10, while DX11 is just worthless. Again, for reference, this is an RTX 2080 Ti. Dropping the shadows in DX11 "appeared to help": I had a great session lasting 5+ hours. Then some crashes hit and I was never able to recover from them. I reinstalled drivers, verified the game files, did a happy chkdsk just for the heck of it, and erased the GPU cache. I never got the stability back... this was on driver 417.22 or so. I haven't tried DX11 again since then. DX10 looks like total arse on TheCenter.

So I have seen a few references to TdrDelay.  Would something like this help?  

https://answers.microsoft.com/en-us/windows/forum/all/registry-fix-for-videotimeoutdetectionandrecovery/deea25d5-d94a-47f8-a09f-be7ce56d5084

I haven't tried just setting TdrDelay but that seems like a very comprehensive set of adjustments.  Was kicking around giving it a go this weekend.  I'm just always worried about losing a dino from a crash. :)  Yes, I'm worried about losing a fake asset in a fake world *cough*
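
Before giving any registry tweak a go, it may be worth checking whether TdrDelay is already set on the machine. The sketch below reads it with Python's standard winreg module; the function name is made up for illustration, and it simply returns None when the value is absent (Windows then uses its built-in default) or when run on something other than Windows.

```python
import sys


def read_tdr_delay():
    """Return the configured TdrDelay in seconds, or None if it is not set
    or if this is not a Windows machine. Read-only; nothing is modified."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    path = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
            value, _value_type = winreg.QueryValueEx(key, "TdrDelay")
            return int(value)
    except FileNotFoundError:
        # Key or value missing: Windows falls back to its default timeout.
        return None


if __name__ == "__main__":
    delay = read_tdr_delay()
    if delay is None:
        print("TdrDelay is not set (or this is not Windows)")
    else:
        print(f"TdrDelay is currently {delay} seconds")
```

Reading the value first makes it easy to note the original state, so the change can be undone if it makes no difference.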

Link to comment

No conjecture needed.

 I am a map developer that uses UE4. I have done extensive testing with NVidia over the past two months. I have a 1080 Ti and a 2080 Ti installed in the same machine. I can switch the monitor to either card and enable either card on demand for testing.

I made a test map for Nvidia that has the crashing issue, and it contains only the core UE4 assets that Ark has available. On this map in the Ark Dev Kit, if I run the 1080 Ti it does not crash. When I use the 2080 Ti, the Dev Kit crashes within minutes.

Hundreds of hours of testing on this map with both cards boil it down to a difference in the card architecture. The 1080 is a Pascal GPU-based card. The 2080 is a Turing-based card. Both GPU architectures have some similarities, but there are elements in the Turing GPUs that are very different.

Ark sends a request to DirectX for every video operation the environment requires. The driver has the job of translating the DirectX 11 requests and formatting them into a suitable structure for the target hardware. The GPU gets each request and needs to calculate the values based on the requirements DirectX has sent.

With the Pascal GPU, DirectX commands are processed and a confirmation is returned to DirectX. This response is then sent to Ark. Occasionally a memory register is triggered that causes a crash. This could be a faulty video card or one that is overclocked and is beginning to fail. If bad data is returned from the video card, it causes Windows to error and BSOD.

With a Turing GPU, the DirectX commands are being sent back, but not all in a timely manner. Something in the firmware of the Turing GPU is causing a delay. This delay causes the driver to stop responding and restart itself. Windows detects that the driver has stopped responding and attempts to restart it, but the driver is still restarting itself, and this causes Windows to panic, throw up an error and BSOD.

NVidia has been trying to solve this issue with a video driver update. This may need to be addressed in a firmware patch to correct the cause of the delay.

This is what I have been testing and working on since the 2080 Ti launch. I now have over 200 hours of testing and reading error logs from UE and from several games that do and don't work correctly with the RTX.

This is not a trivial issue to solve. Nvidia is trying every possible driver fix before any firmware testing will begin. Upgrading firmware on a video card can be very risky and is a last resort. If the firmware fails the card is bricked and must be sent back to NVidia to be flashed by their support staff. This can be a very costly process so it is approached only after all other options are exhausted.

In short, if a driver can be written to correctly talk to both architectures we will be good once NVidia finds that magic code. Otherwise a firmware update will be needed for every Turing GPU that has already been shipped and sold. That is a lot of customers to contact and not everyone registers their product for warranty.

Sorry for the novel.

Link to comment

3 hours ago, TruWrecks said:

This is what I have been testing and working on since the 2080 Ti launch. I now have over 200 hours of testing and reading error logs from UE and from several games that do and don't work correctly with the RTX.

 

Thanks for the thorough testing. So have you found any other UE based game throwing this BSOD?

Link to comment

Archived

This topic is now archived and is closed to further replies.

